Virtualized Radio Access Networks (vRANs) are fully configurable and can be implemented at low cost on commodity platforms, offering unprecedented network management flexibility. In this paper, a novel deep Reinforcement Learning (RL)-based framework is proposed that jointly reconfigures the functional splits of the Base Stations (BSs), the resources and locations of the virtualized Central Units (vCUs) and Distributed Units (vDUs), and the routing for each BS data flow. The objective is to minimize the long-term total network operation cost while adapting to possibly varying traffic demands and resource availability. Testbed measurements are performed to study the relation between traffic demand and computing resource utilization; they reveal that this relation has high variance and depends on the platform and the platform load. Hence, acquiring a perfect model of the underlying vRAN system is highly non-trivial. A comprehensive cost function is formulated that accounts for resource overprovisioning, instantiation, reconfiguration, and declined demands; these costs make it necessary to perform reconfigurations prudently. Motivated by these insights, our solution framework is developed using model-free multi-agent RL, where each agent controls the configuration of one BS. However, each agent has a multi-dimensional discrete action space due to the joint configuration decision for its BS. To overcome the curse of dimensionality, a Dueling Double Q-network with action branching is applied at each agent. Further, each agent learns its optimal policy and selects the action that reconfigures its BS independently. Simulations are performed using an O-RAN compliant model. Our results show that the framework successfully learns the optimal policy, can be readily applied to different vRAN systems via transfer learning, and achieves significant cost savings over the benchmarks.
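The abstract does not specify the network architecture, so the following is only a minimal sketch of how action branching combined with dueling aggregation and a double-Q target is commonly set up. It assumes one branch per configuration dimension (for example functional split, resource level, location, and route), a shared state value, and a target averaged over branches. All function names, branch sizes, and the use of negative cost as reward are hypothetical illustrations, not details taken from the paper.

```python
from math import prod
from statistics import mean


def output_counts(branch_sizes):
    """Return (outputs with branching, size of the joint action space)."""
    return sum(branch_sizes), prod(branch_sizes)


def branch_q_values(state_value, branch_advantages):
    """Dueling aggregation per branch: Q_d(a) = V + A_d(a) - mean(A_d)."""
    q_branches = []
    for advantages in branch_advantages:
        baseline = mean(advantages)
        q_branches.append([state_value + a - baseline for a in advantages])
    return q_branches


def greedy_action(q_branches):
    """Pick the best sub-action independently in every branch."""
    return [max(range(len(q)), key=q.__getitem__) for q in q_branches]


def double_q_target(reward, gamma, q_online_next, q_target_next):
    """Double-Q target: the online network selects the next sub-actions,
    the target network evaluates them, and the result is averaged over
    branches (an assumed aggregation choice)."""
    chosen = greedy_action(q_online_next)
    evaluated = [q[a] for q, a in zip(q_target_next, chosen)]
    return reward + gamma * mean(evaluated)


if __name__ == "__main__":
    # Hypothetical branch sizes: 4 splits, 3 resource levels, 2 locations, 2 routes.
    print(output_counts([4, 3, 2, 2]))

    q_online = branch_q_values(1.0, [[0.0, 2.0], [1.0, 1.0, 4.0]])
    q_target = [[1.0, 3.0], [0.0, 0.0, 5.0]]
    print(greedy_action(q_online))
    # Reward is modeled here as negative operation cost (an assumption).
    print(double_q_target(-5.0, 0.5, q_online, q_target))
```

With the hypothetical branch sizes above, a single flat Q-network would need one output per joint action (48), whereas the branched network needs only one output per sub-action (11). This is the sense in which branching sidesteps the curse of dimensionality in a multi-dimensional discrete action space.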