Large antenna arrays are a key characteristic of millimeter wave (mmWave) and terahertz communication systems. However, because these systems adopt fully analog or hybrid analog/digital architectures, and because of non-ideal hardware or arbitrary/unknown array geometries, accurate channel state information is hard to acquire. This impedes the design of the beamforming/combining vectors that are crucial to fully exploiting the potential of large-scale antenna arrays in providing sufficient receive signal power. In this paper, we develop a novel framework that leverages deep reinforcement learning (DRL) with a Wolpertinger-variant architecture to learn how to iteratively optimize the beam pattern (shape) for serving one user or a small set of users, relying only on receive power measurements and requiring no explicit channel knowledge. The proposed model accounts for key hardware constraints such as the phase-only, constant-modulus, and quantized-angle constraints. Further, the proposed framework can efficiently optimize the beam patterns for systems with non-ideal hardware and for arrays with unknown or arbitrary geometries. Simulation results show that the developed solution finds near-optimal beam patterns based only on receive power measurements.
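To make the hardware constraints named in the abstract concrete, the following is a minimal sketch, not the paper's method. It shows what a phase-only, constant-modulus, quantized-angle beamforming vector looks like and how a receive power measurement |w^H h|^2 could be computed from it. All function names, the half-wavelength uniform linear array, the single-path channel, and the 3-bit phase resolution are assumptions chosen for illustration. In particular, `nearest_phase_indices` uses the channel directly, which the proposed framework explicitly avoids; it appears here only as a reference point for the power a learned beam could approach.

```python
import cmath
import math


def quantized_beam(phase_indices, num_bits):
    """Map integer phase-shifter settings to a unit-norm beam.

    Every entry has modulus 1/sqrt(N) (constant modulus, phase-only) and
    its angle is one of 2**num_bits evenly spaced values (quantized angle).
    """
    levels = 2 ** num_bits
    scale = 1.0 / math.sqrt(len(phase_indices))
    return [scale * cmath.exp(2j * math.pi * (k % levels) / levels)
            for k in phase_indices]


def receive_power(beam, channel):
    """Receive power |w^H h|^2 for beam w and channel h (noise-free)."""
    gain = sum(w.conjugate() * h for w, h in zip(beam, channel))
    return abs(gain) ** 2


def steering_channel(num_antennas, angle_rad):
    """Hypothetical single-path channel of a half-wavelength uniform linear array."""
    return [cmath.exp(1j * math.pi * n * math.sin(angle_rad))
            for n in range(num_antennas)]


def nearest_phase_indices(channel, num_bits):
    """Reference only: quantize each channel phase to the nearest level.

    This needs explicit channel knowledge, which the framework in the
    abstract does not assume.
    """
    levels = 2 ** num_bits
    step = 2.0 * math.pi / levels
    return [round(cmath.phase(h) / step) % levels for h in channel]


if __name__ == "__main__":
    h = steering_channel(num_antennas=8, angle_rad=0.3)
    w = quantized_beam(nearest_phase_indices(h, num_bits=3), num_bits=3)
    print("receive power:", receive_power(w, h))
```

In a power-only setting such as the one the abstract describes, a learning agent would choose the integer phase indices and observe only the value returned by something like `receive_power`, never the channel itself.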