Owing to its low cost and controllability, the reconfigurable intelligent surface (RIS) is a promising candidate for addressing the blockage issue in millimeter wave (mmWave) communication systems and has consequently attracted widespread attention in recent years. However, jointly designing the active and passive beamformers is difficult because of the high computational complexity and the dynamic nature of the wireless environment. In this paper, we consider an RIS-assisted multi-user multiple-input single-output (MU-MISO) mmWave system and develop a deep reinforcement learning (DRL) based algorithm that jointly designs the active hybrid beamformer at the base station (BS) and the passive beamformer at the RIS. Building on the soft actor-critic (SAC) algorithm, we propose a maximum-entropy DRL algorithm that learns a stochastic policy, and can therefore explore more broadly than deterministic-policy methods, to design the active analog precoder and the passive beamformer simultaneously. The digital precoder is then determined by the minimum mean square error (MMSE) method. Experimental results show that the proposed SAC algorithm outperforms both a conventional optimization algorithm and a baseline DRL algorithm.
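To make the two-stage structure concrete, the following is a minimal NumPy sketch of the second stage only: computing an MMSE (regularized zero-forcing) digital precoder once an analog precoder and a set of RIS phase shifts are given. It is an illustration under stated assumptions, not the paper's implementation. The function names, the symbols `H_r`, `G`, `theta`, `F_rf`, the fully blocked direct link, the regularization factor `K * noise_var / power`, and the total-power normalization are all assumptions introduced here; in the proposed scheme `F_rf` and `theta` would come from the SAC agent, whereas in this sketch they are simply passed in.

```python
import numpy as np

def cascaded_channel(H_r, G, theta):
    """Effective BS-to-user channel through the RIS (direct link assumed blocked).

    H_r:   (K, M) RIS-to-user channels
    G:     (M, Nt) BS-to-RIS channel
    theta: (M,) RIS phase shifts in radians
    """
    Phi = np.diag(np.exp(1j * theta))  # unit-modulus passive beamformer
    return H_r @ Phi @ G               # shape (K, Nt)

def mmse_digital_precoder(H, F_rf, power, noise_var):
    """MMSE digital precoder for the channel seen after analog precoding.

    H: (K, Nt) channel, F_rf: (Nt, N_rf) analog precoder.
    Returns F_bb of shape (N_rf, K), scaled so ||F_rf F_bb||_F^2 = power.
    """
    K = H.shape[0]
    H_eff = H @ F_rf                    # (K, N_rf) equivalent baseband channel
    reg = K * noise_var / power         # assumed regularization term
    gram = H_eff @ H_eff.conj().T + reg * np.eye(K)
    F_bb = H_eff.conj().T @ np.linalg.inv(gram)
    # Enforce the total transmit power constraint on the hybrid precoder.
    F_bb = F_bb * np.sqrt(power) / np.linalg.norm(F_rf @ F_bb, "fro")
    return F_bb

def sum_rate(H, F_rf, F_bb, noise_var):
    """Sum rate in bit/s/Hz, treating inter-user interference as noise."""
    A = H @ F_rf @ F_bb                 # (K, K) end-to-end gains
    signal = np.abs(np.diag(A)) ** 2
    interference = np.sum(np.abs(A) ** 2, axis=1) - signal
    return float(np.sum(np.log2(1.0 + signal / (interference + noise_var))))
```

In a DRL loop of this kind, a quantity such as `sum_rate` would be a natural reward signal, with the agent's action mapped to the phases of `F_rf` and `theta`; that mapping is likewise an assumption here rather than something the abstract specifies.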