A Deep Reinforcement Learning-Based Resource Scheduler for Massive MIMO Networks

The large number of antennas in massive MIMO systems allows the base station to serve multiple users on the same time-frequency resource through multi-user beamforming. However, highly correlated user channels can drastically reduce the spectral efficiency that multi-user beamforming achieves. It is therefore critical for the base station to schedule a suitable group of users in each transmission interval so as to maximize spectral efficiency while adhering to fairness constraints among the users. User scheduling is an NP-hard problem whose complexity grows exponentially with the number of users. In this paper, we consider the user scheduling problem for massive MIMO systems. Inspired by recent achievements of deep reinforcement learning (DRL) in solving problems with large action sets, we propose \name{}, a dynamic scheduler for massive MIMO based on the state-of-the-art Soft Actor-Critic (SAC) DRL model and the K-Nearest Neighbors (KNN) algorithm. Through comprehensive simulations using realistic massive MIMO channel models as well as real-world datasets from channel measurement experiments, we demonstrate the effectiveness of our proposed model in various channel conditions. Our results show that the proposed model performs very close to the optimal proportionally fair (PF) scheduler in terms of spectral efficiency and fairness, with more than one order of magnitude lower computational complexity, in medium-sized networks where PF is computationally feasible. Our results also show the feasibility and high performance of the proposed scheduler in networks with a large number of users.
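To make the cost of the optimal PF baseline concrete, the following is a minimal sketch of an exhaustive proportionally fair search over user groups. It is an illustration only, not the paper's implementation: the zero-forcing precoder with equal power split, unit noise power, the `snr` value, and the function names `zf_rates` and `exhaustive_pf` are all assumptions introduced here.

```python
from itertools import combinations

import numpy as np


def zf_rates(H, group, snr=10.0):
    """Per-user rates (bit/s/Hz) for `group`, assuming zero-forcing
    beamforming, equal power split, and unit noise power (assumptions).
    Requires the selected channels to be linearly independent."""
    Hg = H[list(group), :]                  # (m, M) channels of scheduled users
    gram = Hg @ Hg.conj().T                 # (m, m) Gram matrix
    inv_diag = np.real(np.diag(np.linalg.inv(gram)))
    # ZF effective gain of user k is 1 / [(H H^H)^-1]_kk.
    sinr = (snr / len(group)) / inv_diag
    return np.log2(1.0 + sinr)


def exhaustive_pf(H, avg_rate, max_group, snr=10.0):
    """Try every group of up to `max_group` users and return the one that
    maximizes the PF metric sum_k rate_k / avg_rate_k."""
    num_users = H.shape[0]
    best_group, best_metric, n_checked = None, -np.inf, 0
    for m in range(1, max_group + 1):
        for group in combinations(range(num_users), m):
            n_checked += 1
            rates = zf_rates(H, group, snr)
            metric = float(np.sum(rates / avg_rate[list(group)]))
            if metric > best_metric:
                best_group, best_metric = group, metric
    return best_group, best_metric, n_checked


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    K, M = 8, 16                            # users, base-station antennas
    H = (rng.standard_normal((K, M)) + 1j * rng.standard_normal((K, M))) / np.sqrt(2)
    group, metric, n_checked = exhaustive_pf(H, np.ones(K), max_group=4)
    print(group, round(metric, 3), n_checked)
```

The number of candidate groups is the sum of binomial coefficients C(K, m) for m up to `max_group`, which is what makes the exhaustive search impractical as the number of users K grows. Dividing each rate by the user's past average rate is what steers the search toward users that have been served less.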
