Virtualized radio access network (vRAN) is one of the key enablers of future wireless networks, as it brings agility to the radio access network (RAN) architecture and offers new degrees of design freedom. Yet it also raises a challenging problem: how to design the functional split configuration. In this paper, a deep reinforcement learning approach is proposed to optimize function splitting in vRAN. A learning paradigm is developed that optimizes the placement of RAN functions, which can be hosted either at a central/cloud unit (CU) or at a distributed unit (DU). The problem is formulated as constrained neural combinatorial reinforcement learning to minimize the total network cost. In our solution, a policy gradient method with Lagrangian relaxation is applied, using a stacked long short-term memory (LSTM) neural network architecture to approximate the policy. A sampling technique with a temperature hyperparameter is then applied during inference. The results show that our proposed solution can learn the optimal function split decision, solving the problem with a $0.4\%$ optimality gap. Moreover, our method can reduce the cost by up to $320\%$ compared to a distributed RAN (D-RAN). We also find that altering the traffic load and routing cost does not significantly degrade the optimality performance.
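The temperature-based sampling step mentioned above can be illustrated with a minimal sketch: dividing the policy logits by a temperature $T$ before the softmax sharpens ($T<1$) or flattens ($T>1$) the distribution from which split decisions are drawn. The function name, logits, and temperature value below are illustrative, not taken from the paper.

```python
import numpy as np

def sample_split(logits, temperature=1.0, rng=None):
    """Sample one function-split decision (e.g., CU vs. DU placement)
    from policy logits, using a temperature hyperparameter.

    Lower temperatures concentrate probability on the highest-scoring
    split; higher temperatures encourage exploration of alternatives.
    """
    rng = rng or np.random.default_rng()
    scaled = np.asarray(logits, dtype=float) / temperature
    scaled -= scaled.max()  # subtract max for numerical stability
    probs = np.exp(scaled) / np.exp(scaled).sum()
    # Draw an index over the candidate split configurations
    return int(rng.choice(len(probs), p=probs)), probs

# Example: hypothetical logits over three candidate splits for one DU
decision, probs = sample_split([2.0, 0.5, -1.0], temperature=0.5)
```

At inference time, one would typically draw several such samples, evaluate the resulting network cost for each, and keep the cheapest feasible configuration.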