Energy Efficiency (EE) is of high importance in Massive Multiple-Input Multiple-Output (M-MIMO) networks, where base stations (BSs) are equipped with antenna arrays composed of up to hundreds of elements. M-MIMO transmission, although highly spectrally efficient, results in high energy consumption that grows with the number of antennas. This paper investigates EE improvement through switching underutilized BSs on and off. A location-aware approach is proposed, in which data about the optimal set of active BSs is stored in a Radio Environment Map (REM). For efficient acquisition, processing, and utilization of the REM data, reinforcement learning (RL) algorithms are used. State-of-the-art exploration/exploitation methods, including e-greedy, Upper Confidence Bound (UCB), and Gradient Bandit, are evaluated. Then, analytical action filtering and an REM-based Exploration Algorithm (REM-EA) are proposed to improve the RL convergence time. The algorithms are evaluated using an advanced, system-level simulator of an M-MIMO Heterogeneous Network (HetNet) utilizing an accurate 3D ray-tracing radio channel model. The proposed RL-based BS switching algorithm is shown to provide a 70% gain in EE over a state-of-the-art algorithm based on an analytical heuristic. Moreover, the proposed action filtering and REM-EA reduce RL convergence time by 60% and 83%, respectively, relative to the best-performing state-of-the-art exploration method.
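To illustrate the kind of RL formulation the abstract refers to, the sketch below frames BS switching as a multi-armed bandit with e-greedy exploration: each action is an on/off pattern of switchable BSs and the reward is the resulting energy efficiency. This is a minimal illustrative example, not the paper's implementation; the number of BSs, the epsilon value, and the toy reward model (ee_reward) are assumptions standing in for the system-level simulator.

```python
# Minimal sketch (illustrative only): an epsilon-greedy multi-armed bandit
# choosing which set of switchable BSs to keep active, with energy
# efficiency (EE) as the reward. N_BS, EPSILON, and ee_reward are assumed.
import itertools
import random

N_BS = 4                 # assumed number of switchable BSs
EPSILON = 0.1            # exploration probability for e-greedy
ACTIONS = [tuple(bits) for bits in itertools.product((0, 1), repeat=N_BS)]

q_values = {a: 0.0 for a in ACTIONS}   # estimated EE per on/off pattern
counts = {a: 0 for a in ACTIONS}

def ee_reward(active_set):
    """Placeholder for the network simulator: returns the EE [bit/J]
    achieved with the given on/off pattern of BSs (toy model)."""
    throughput = 1.0 + sum(active_set)      # toy throughput model
    power = 0.5 + 1.5 * sum(active_set)     # toy power-consumption model
    return throughput / power

def select_action():
    if random.random() < EPSILON:
        return random.choice(ACTIONS)       # explore a random pattern
    return max(ACTIONS, key=q_values.get)   # exploit best estimate so far

for step in range(1000):
    action = select_action()
    reward = ee_reward(action)
    counts[action] += 1
    # incremental sample-average update of the EE estimate
    q_values[action] += (reward - q_values[action]) / counts[action]

best = max(ACTIONS, key=q_values.get)
print("best on/off pattern:", best, "estimated EE:", round(q_values[best], 3))
```

In the paper's setting, the exploration step is where UCB, Gradient Bandit, the proposed action filtering, or REM-EA would replace the plain e-greedy choice, and the reward would come from the M-MIMO HetNet simulator rather than a closed-form toy model.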