Setting the transmit power of 5G cells has been a long-standing topic of discussion, as optimized power settings can reduce interference and improve the quality of service delivered to users. Recently, machine learning (ML)-based, and especially reinforcement learning (RL)-based, control methods have received much attention. However, there is little discussion of the generalisation ability of the trained RL models. This paper points out that an RL agent trained in a specific indoor environment is room-dependent and cannot directly serve new, heterogeneous environments. Therefore, in the context of the Open Radio Access Network (O-RAN), this paper proposes a distributed cell power-control scheme based on Federated Reinforcement Learning (FRL). During training, models from different indoor environments are aggregated into a global model, and the central server then broadcasts the updated model back to each client. The global model is also used as the base model for adaptive training in new environments. Simulation results show that the FRL model achieves performance similar to a single RL agent, and both outperform the random power-allocation method and the exhaustive search method. The generalisation test shows that using the FRL model as the base model improves the convergence speed of training in a new environment.
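As a rough illustration of the aggregate-then-broadcast step described above, the sketch below shows a FedAvg-style weighted average of per-client model parameters followed by a broadcast back to each client. The function names, the client interface, and the use of plain NumPy arrays are assumptions for illustration only, not the paper's actual implementation.

```python
# Minimal sketch of FedAvg-style aggregation for an FRL power-control scheme.
# All names (aggregate, federated_round, client.train_locally, ...) are
# illustrative assumptions, not taken from the paper.
from typing import List
import numpy as np


def aggregate(client_weights: List[List[np.ndarray]],
              client_sizes: List[int]) -> List[np.ndarray]:
    """Weighted average of per-client parameters (one list of arrays per client)."""
    total = float(sum(client_sizes))
    n_layers = len(client_weights[0])
    return [
        sum(w[k] * (n / total) for w, n in zip(client_weights, client_sizes))
        for k in range(n_layers)
    ]


def federated_round(clients, global_weights):
    """One communication round: broadcast, local RL training, then aggregation."""
    updates, sizes = [], []
    for client in clients:
        client.set_weights(global_weights)       # broadcast current global model
        updates.append(client.train_locally())   # local RL updates in the client's room
        sizes.append(client.num_samples)         # weight clients by local experience size
    return aggregate(updates, sizes)             # new global model for the next round
```

The same averaged model would then serve as the base model when adapting to a previously unseen indoor environment, which is the generalisation scenario the abstract refers to.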