This paper formulates the problem of maximizing the downlink sum-rate of a multi-user multiple-input single-output (MU-MISO) system in which ground users are served by an aerial reconfigurable intelligent surface (ARIS) that acts as a relay to the terrestrial base station. A deep deterministic policy gradient (DDPG) algorithm is proposed to optimize the active beamforming matrix at the base station and the phase shifts of the reflecting elements at the ARIS so as to maximize the data rate. Simulation results show that the proposed scheme outperforms deep Q-learning (DQL) and baseline approaches.
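To make the optimization target concrete, the following is a minimal sketch of a sum-rate objective for a RIS-assisted MU-MISO downlink. The abstract does not give the system model, so everything here is an assumption for illustration: the function name `sum_rate`, the symbols `W`, `theta`, `G`, `H`, the absence of a direct base-station-to-user link, ideal unit-modulus reflection, and the standard SINR-based rate expression. A quantity of this kind would typically serve as the reward signal for a DDPG agent whose action is the pair (`W`, `theta`).

```python
import numpy as np


def sum_rate(W, theta, G, H, noise_power=1.0):
    """Sum rate in bit/s/Hz under an assumed RIS-assisted MU-MISO model.

    W     : (M, K) complex beamforming matrix, column k serves user k
    theta : (N,) real phase shifts of the reflecting elements, in radians
    G     : (N, M) complex channel from base station to ARIS (assumed name)
    H     : (K, N) complex channel from ARIS to users, row k is h_k^H
    """
    phi = np.exp(1j * theta)          # unit-modulus reflection coefficients
    H_eff = (H * phi) @ G             # (K, M): row k equals h_k^H diag(phi) G
    P = np.abs(H_eff @ W) ** 2        # P[k, j]: power at user k from beam j
    signal = np.diag(P)               # desired-signal power per user
    interference = P.sum(axis=1) - signal
    sinr = signal / (interference + noise_power)
    return float(np.sum(np.log2(1.0 + sinr)))


if __name__ == "__main__":
    # Hypothetical sizes: M antennas, N reflecting elements, K users.
    rng = np.random.default_rng(0)
    M, N, K = 4, 16, 2
    G = (rng.standard_normal((N, M)) + 1j * rng.standard_normal((N, M))) / np.sqrt(2)
    H = (rng.standard_normal((K, N)) + 1j * rng.standard_normal((K, N))) / np.sqrt(2)
    W = (rng.standard_normal((M, K)) + 1j * rng.standard_normal((M, K))) / np.sqrt(2)
    theta = rng.uniform(0.0, 2.0 * np.pi, size=N)
    print(sum_rate(W, theta, G, H))
```

In this sketch the beamforming matrix is not power-normalized; a practical setup would also enforce a transmit-power constraint on `W`, which is omitted here for brevity.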