Future generations of mobile networks are expected to contain an increasing number of antennas with growing complexity and more configurable parameters. Optimizing these parameters is necessary to ensure good network performance. The scale of mobile networks makes it challenging to optimize antenna parameters through manual intervention or hand-engineered strategies. Reinforcement learning is a promising technique to address this challenge, but existing methods often rely on local optimizations to scale to large network deployments. We propose a new multi-agent reinforcement learning algorithm to optimize mobile network configurations globally. By using a value decomposition approach, our algorithm can be trained from a global reward function instead of relying on an ad-hoc decomposition of network performance across the different cells. The algorithm uses a graph neural network architecture that generalizes to different network topologies and learns coordination behaviors. We empirically demonstrate the performance of the algorithm on an antenna tilt tuning problem and a joint tilt and power control problem in a simulated environment.
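To make the idea concrete, the sketch below shows one possible way to combine value decomposition with neighbor aggregation over a cell graph, in the spirit described above: each cell's Q-values come from a shared message-passing network, the joint Q-value is the sum of the per-cell Q-values of the chosen actions, and a single global reward drives the TD update. This is a minimal illustration under assumed names and dimensions (CellGraphQNet, OBS_DIM, NUM_ACTIONS), not the paper's implementation.

```python
# Minimal sketch: value decomposition over a cell graph trained from a global reward.
# All module names, dimensions, and hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn

OBS_DIM = 8       # per-cell observation size (assumed)
NUM_ACTIONS = 5   # e.g. discrete tilt adjustments (assumed)

class CellGraphQNet(nn.Module):
    """Shared per-cell Q-network with one round of neighbor aggregation."""
    def __init__(self, obs_dim=OBS_DIM, hidden=64, num_actions=NUM_ACTIONS):
        super().__init__()
        self.encode = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU())
        self.message = nn.Linear(hidden, hidden)
        self.q_head = nn.Sequential(nn.Linear(2 * hidden, hidden), nn.ReLU(),
                                    nn.Linear(hidden, num_actions))

    def forward(self, obs, adj):
        # obs: (num_cells, obs_dim); adj: (num_cells, num_cells) 0/1 adjacency
        h = self.encode(obs)
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1.0)
        neigh = (adj @ self.message(h)) / deg               # mean over neighbor messages
        return self.q_head(torch.cat([h, neigh], dim=-1))   # (num_cells, num_actions)

def joint_q(per_cell_q, actions):
    # Value decomposition: Q_tot(s, a) = sum_i Q_i(o_i, a_i)
    return per_cell_q.gather(1, actions.unsqueeze(1)).sum()

# One TD-learning step driven by a single network-wide reward (illustrative shapes).
net = CellGraphQNet()
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
num_cells = 7
obs = torch.randn(num_cells, OBS_DIM)
next_obs = torch.randn(num_cells, OBS_DIM)
adj = (torch.rand(num_cells, num_cells) < 0.3).float()
adj = ((adj + adj.t()) > 0).float().fill_diagonal_(0)      # symmetric, no self-loops
actions = torch.randint(NUM_ACTIONS, (num_cells,))
global_reward = torch.tensor(1.7)   # e.g. an aggregate coverage/capacity score (assumed)
gamma = 0.99

q_tot = joint_q(net(obs, adj), actions)
with torch.no_grad():
    target = global_reward + gamma * net(next_obs, adj).max(dim=1).values.sum()
loss = (q_tot - target) ** 2
opt.zero_grad()
loss.backward()
opt.step()
```

Because the per-cell network and the aggregation are shared across cells, the same parameters can be applied to graphs with different numbers of cells or different adjacency structures, which is the sense in which such an architecture generalizes across network topologies.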