Autonomous Vehicles (AVs) are required to operate safely and efficiently in dynamic environments. To this end, AVs equipped with Joint Radar-Communications (JRC) functions can enhance driving safety by using both radar detection and data communication. However, optimizing the performance of an AV system with two different functions under the uncertainty and dynamics of the surrounding environment is very challenging. In this work, we first propose an intelligent optimization framework based on the Markov Decision Process (MDP) to help the AV make optimal decisions when selecting JRC operation functions under the dynamics and uncertainty of the surrounding environment. We then develop an effective learning algorithm that leverages recent advances in deep reinforcement learning to find the optimal policy for the AV without requiring any prior information about the surrounding environment. Furthermore, to make the proposed framework more scalable, we develop a Transfer Learning (TL) mechanism that enables the AV to leverage valuable prior experience to accelerate the training process when it moves to a new environment. Extensive simulations show that the proposed transferable deep reinforcement learning framework reduces the AV's obstacle miss-detection probability by up to 67% compared to other conventional deep reinforcement learning approaches.
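To make the decision problem concrete, the following is a minimal toy sketch, not the framework proposed in this work. It swaps in tabular Q-learning for deep reinforcement learning and approximates transfer learning by warm-starting the Q-table learned in a source environment. The two-state environment, the reward values, the `p_risky` parameter, and all function names are hypothetical assumptions chosen only for illustration.

```python
import random

# Hypothetical two-state environment (an assumption, not from the paper).
SAFE, RISKY = 0, 1
# The two JRC operation functions the AV chooses between.
COMMUNICATE, RADAR = 0, 1


def reward(state, action):
    """Illustrative rewards: radar pays off when an obstacle is likely,
    communication pays off otherwise; a missed detection is penalized."""
    if state == RISKY:
        return 1.0 if action == RADAR else -1.0
    return 1.0 if action == COMMUNICATE else 0.0


def train(p_risky, steps, init_q=None, seed=0, alpha=0.1, gamma=0.9, eps=0.1):
    """Tabular Q-learning with epsilon-greedy exploration.

    p_risky is the (assumed) probability that the next state is RISKY.
    init_q optionally warm-starts the table, standing in for transfer.
    """
    rng = random.Random(seed)
    # Copy the initial table so the source table is never mutated.
    q = [row[:] for row in init_q] if init_q else [[0.0, 0.0], [0.0, 0.0]]
    state = SAFE
    for _ in range(steps):
        if rng.random() < eps:
            action = rng.randrange(2)  # explore
        else:
            action = max((COMMUNICATE, RADAR), key=lambda a: q[state][a])
        next_state = RISKY if rng.random() < p_risky else SAFE
        # Standard Q-learning update toward the bootstrapped target.
        target = reward(state, action) + gamma * max(q[next_state])
        q[state][action] += alpha * (target - q[state][action])
        state = next_state
    return q


def greedy_policy(q):
    """Return the greedy action for each state."""
    return [max((COMMUNICATE, RADAR), key=lambda a: row[a]) for row in q]


# Learn in a source environment, then reuse the table in a new one
# with a much smaller training budget.
source_q = train(p_risky=0.3, steps=20000)
target_q = train(p_risky=0.6, steps=500, init_q=source_q, seed=1)
```

In this sketch the warm-started table lets the agent start the new environment with a sensible policy instead of learning from zero, which is the intuition behind accelerating training through transfer. A deep reinforcement learning variant would replace the table with a neural network and transfer its weights instead.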