Massive Multiple-Input Multiple-Output (M-MIMO) is considered one of the key technologies for 5G and future 6G networks. From the perspective of, e.g., channel estimation, especially for high-speed users, it is easier to implement an M-MIMO network that exploits a static set of beams, i.e., a Grid of Beams (GoB). Under a GoB, it is important to assign users to beams properly, i.e., to perform Beam Management (BM). BM can be enhanced by taking historical knowledge about the radio environment into account, e.g., to avoid radio link failures. The aim of this paper is to propose a BM algorithm that utilizes location-dependent data stored in a Radio Environment Map (REM). It exploits received-power maps and user mobility patterns to optimize the BM process through Reinforcement Learning (RL), using the Policy Iteration method under different goal functions, e.g., maximization of received power or minimization of beam reselections while avoiding radio link failures. The proposed solution is compliant with the Open Radio Access Network (O-RAN) architecture, enabling its practical implementation. Simulation studies show that the proposed BM algorithm can significantly reduce the number of beam reselections or radio link failures compared to a baseline algorithm.
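To make the Policy Iteration step concrete, the following is a minimal sketch of the method on a toy beam-management MDP. Everything here is a hedged illustration, not the paper's implementation: the state space (position-beam pairs), the two actions (keep vs. reselect a beam), the transition matrices, the REM-style power values, and the reselection penalty are all invented for demonstration.

```python
import numpy as np

# Toy MDP: states are hypothetical (position, beam) pairs; action 0 keeps
# the serving beam, action 1 reselects. All numbers below are invented.
N_STATES = 4
ACTIONS = [0, 1]
GAMMA = 0.9           # discount factor (assumed)
SWITCH_PENALTY = 0.5  # cost of a beam reselection (assumed)

rng = np.random.default_rng(0)
# P[a][s, s']: transition probabilities under action a (each row sums to 1)
P = [rng.dirichlet(np.ones(N_STATES), size=N_STATES) for _ in ACTIONS]
# power[s]: REM-style received-power estimate per state (invented values)
power = np.array([1.0, 0.2, 0.6, 0.9])
# R[a][s]: expected reward = expected next-state power minus reselection cost
R = [P[a] @ power - (SWITCH_PENALTY if a == 1 else 0.0) for a in ACTIONS]

def policy_iteration(P, R, gamma=GAMMA):
    policy = np.zeros(N_STATES, dtype=int)
    while True:
        # Policy evaluation: solve (I - gamma * P_pi) V = R_pi exactly
        P_pi = np.array([P[policy[s]][s] for s in range(N_STATES)])
        R_pi = np.array([R[policy[s]][s] for s in range(N_STATES)])
        V = np.linalg.solve(np.eye(N_STATES) - gamma * P_pi, R_pi)
        # Policy improvement: act greedily with respect to the evaluated values
        Q = np.array([R[a] + gamma * P[a] @ V for a in ACTIONS])
        new_policy = Q.argmax(axis=0)
        if np.array_equal(new_policy, policy):
            return policy, V
        policy = new_policy

policy, V = policy_iteration(P, R)
print("greedy beam action per state:", policy)
```

Swapping the reward definition (e.g., dropping the power term and penalizing only reselections and link failures) changes the goal function while the Policy Iteration loop itself stays unchanged, which mirrors how the abstract describes supporting different optimization targets.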