Mobile networks are composed of many base stations, and for each of them many parameters must be optimized to provide good service. Automatically and dynamically optimizing all these entities is challenging because they are sensitive to variations in the environment and can affect each other through interference. Reinforcement learning (RL) algorithms are good candidates for automatically learning base station configuration strategies from incoming data, but they are often hard to scale to many agents. In this work, we demonstrate how to use coordination graphs and reinforcement learning in a complex application involving hundreds of cooperating agents. We show how mobile networks can be modeled using coordination graphs and how network optimization problems can be solved efficiently using multi-agent reinforcement learning. The graph structure arises naturally from expert knowledge about the network and allows coordinating behaviors between the antennas to be learned explicitly through edge value functions represented by neural networks. We show empirically that coordinated reinforcement learning outperforms other methods. The use of local RL updates and parameter sharing can handle a large number of agents without sacrificing coordination, which makes the approach well suited to optimizing the ever denser networks brought by 5G and beyond.
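As a minimal illustration of the edge value functions mentioned above (a sketch from the standard coordination-graph formulation, not a formula taken from this paper), the global action-value function is typically factored over the edges of a graph G = (V, E) whose vertices are the agents:

    Q(s, \mathbf{a}) = \sum_{(i,j) \in E} Q_{ij}(s, a_i, a_j)

where each edge term Q_{ij} (here a neural network whose parameters can be shared across edges) scores the joint action (a_i, a_j) of two neighboring antennas. The greedy joint action over this factored form can then be computed with message-passing schemes such as max-plus instead of enumerating the exponentially large joint action space.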