A Reinforcement Learning Based Approach for Multi-target Detection in Massive MIMO Radar

This paper considers the problem of multi-target detection for massive multiple-input multiple-output (MMIMO) cognitive radar (CR). The concept of CR is based on the perception-action cycle, in which the radar senses a dynamic environment and intelligently adapts to it in order to best accomplish a specific mission. However, this usually requires a priori knowledge of the environmental model, which is not available in most cases. We propose a reinforcement learning (RL) based algorithm for cognitive multi-target detection in the presence of unknown disturbance statistics. The radar acts as an agent that continuously senses the unknown environment (i.e., targets and disturbance) and then optimizes the transmitted waveforms to maximize the probability of detection ($P_\mathsf{D}$) by focusing the energy in specific range-angle cells (i.e., beamforming). Furthermore, we propose a solution to the beamforming optimization problem with lower complexity than existing methods. Numerical simulations are performed to assess the performance of the proposed RL-based algorithm in both stationary and dynamic environments. The RL-based beamformer is compared with the conventional omnidirectional approach with equal power allocation and with adaptive beamforming without RL. As the numerical results show, our RL-based beamformer outperforms both approaches in terms of target detection performance. The improvement is particularly remarkable under harsh conditions such as low SNR, heavy-tailed disturbance, and rapidly changing scenarios.
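To make the perception-action cycle concrete, the following is a minimal toy sketch and not the algorithm proposed in the paper. It assumes a single range bin split into angle cells, Gaussian noise in place of the unknown disturbance, a known number of beams, and a running-mean score per cell instead of a full RL value function. All names (`allocate_power`, `perception_action_loop`, `explore`, `amplitude`) are hypothetical and chosen for illustration only.

```python
import numpy as np

def allocate_power(scores, n_beams, explore=0.2):
    """Split unit transmit power: a uniform exploration share, the rest on the top-scoring cells."""
    n_cells = scores.size
    power = np.full(n_cells, explore / n_cells)
    top = np.argsort(scores)[-n_beams:]          # indices of the n_beams highest scores
    power[top] += (1.0 - explore) / n_beams
    return power

def perception_action_loop(n_cells=16, target_cells=(3, 11), n_steps=300,
                           amplitude=2.0, seed=0):
    """Toy loop: sense every angle cell, update per-cell scores, refocus the transmit power."""
    rng = np.random.default_rng(seed)
    targets = np.zeros(n_cells)
    targets[list(target_cells)] = 1.0
    scores = np.zeros(n_cells)                   # running mean of the per-cell statistic
    power = np.full(n_cells, 1.0 / n_cells)      # start omnidirectional (equal power)
    for step in range(1, n_steps + 1):
        # Perception: the echo strength scales with the power sent toward each cell.
        gain = np.sqrt(power * n_cells)          # equals 1.0 everywhere under equal power
        echo = amplitude * gain * targets + rng.standard_normal(n_cells)
        scores += (echo - scores) / step         # incremental mean update
        # Action: focus most of the power on the cells that currently look occupied.
        power = allocate_power(scores, n_beams=len(target_cells))
    return scores, power

scores, power = perception_action_loop()
print("focused cells:", sorted(np.argsort(scores)[-2:].tolist()))
print("power on focused cells:", power[np.argsort(scores)[-2:]])
```

The fixed exploration share keeps some energy on every cell so that a target appearing in a new direction can still be noticed. This is a crude stand-in for the exploration-exploitation trade-off that an RL agent would handle through its policy.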
