Integrated sensing and communication (ISAC) is a key enabling technology for vehicular networks. In this scenario, however, the communication channel is time-varying and potential sensing targets may move rapidly, a combination we refer to as double dynamics. This makes real-time precoder design challenging. Optimization-based solutions are widely studied, but they are computationally complex and rely heavily on perfect prior information, which is impractical under double dynamics. To address this challenge, we propose using constrained deep reinforcement learning (CDRL) to update the ISAC precoder dynamically. We further tailor the primal-dual deep deterministic policy gradient (PD-DDPG) algorithm and the Wolpertinger architecture to train the agent efficiently under complex constraints and a variable number of users. The proposed scheme not only adapts to the dynamics based on observations but also exploits environmental information to improve performance and reduce complexity. Experiments validate its superiority over existing approaches.
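As a rough illustration of the two building blocks named above, the following minimal Python sketch shows a generic Wolpertinger-style action selection (pick the best-valued of the k discrete candidates nearest to a continuous proto-action) and a generic primal-dual multiplier update. It is not the tailored PD-DDPG design, whose details the abstract does not give; all function names, parameters, the toy critic, and the step size are hypothetical and chosen only for illustration.

```python
import math


def wolpertinger_select(proto_action, candidate_actions, q_value, k=3):
    """Pick the best of the k candidates nearest to a continuous proto-action.

    All names here are hypothetical; q_value stands in for a learned critic.
    """
    # Rank the discrete candidates by Euclidean distance to the proto-action.
    nearest = sorted(candidate_actions,
                     key=lambda a: math.dist(a, proto_action))[:k]
    # Let the critic choose among the k nearest neighbours.
    return max(nearest, key=q_value)


def dual_ascent_step(multiplier, constraint_cost, cost_limit, step_size=0.1):
    """Projected gradient ascent on a Lagrange multiplier (kept non-negative)."""
    return max(0.0, multiplier + step_size * (constraint_cost - cost_limit))


def lagrangian_reward(reward, constraint_cost, multiplier):
    """Reward penalised by the weighted constraint cost (primal objective)."""
    return reward - multiplier * constraint_cost


if __name__ == "__main__":
    # Toy discrete action set and toy critic, both assumptions for the demo.
    candidates = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
    toy_q = lambda a: a[0] + 2.0 * a[1]
    print(wolpertinger_select((0.8, 0.1), candidates, toy_q, k=2))
    print(dual_ascent_step(0.5, constraint_cost=2.0, cost_limit=1.0))
```

In this sketch, the multiplier grows while the measured constraint cost exceeds its limit and shrinks toward zero otherwise, so the penalised reward steers the policy back toward feasibility. A larger k trades extra critic evaluations for a wider search around the proto-action.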