A decision-making strategy for autonomous vehicles describes a sequence of driving maneuvers that achieves a given navigational mission. This paper uses deep reinforcement learning (DRL) to address the continuous-horizon decision-making problem on the highway. First, the vehicle kinematics and the highway driving scenario are introduced. The objective of the ego automated vehicle is to execute an efficient and smooth policy without collision. Then, the specific algorithm, proximal policy optimization (PPO)-enhanced DRL, is described. It is applied to overcome slow training and sample inefficiency, and it achieves high learning efficiency and excellent control performance. Finally, the PPO-DRL-based decision-making strategy is evaluated from multiple perspectives, including optimality, learning efficiency, and adaptability. Its potential for online application is discussed by applying it to similar driving scenarios.
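For readers unfamiliar with PPO, the following is a minimal sketch of the standard clipped surrogate objective that PPO maximizes. It is not taken from the paper: the function name `ppo_clip_objective`, its argument names, and the default clipping range `clip_eps=0.2` are illustrative assumptions, and the paper's own network, reward, and hyperparameters are not reproduced here.

```python
import numpy as np


def ppo_clip_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Clipped surrogate objective of PPO (to be maximized).

    All names and the default clip_eps are assumptions for illustration.
    new_logp, old_logp: log-probabilities of the taken actions under the
        current policy and the data-collecting policy.
    advantages: advantage estimates for those actions.
    """
    # Probability ratio between the current and the old policy.
    ratio = np.exp(new_logp - old_logp)
    # Unclipped and clipped surrogate terms.
    unclipped = ratio * advantages
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    # The elementwise minimum removes the incentive to move the ratio
    # far outside [1 - clip_eps, 1 + clip_eps].
    return float(np.mean(np.minimum(unclipped, clipped)))
```

When the two policies coincide, the ratio is 1 and the objective reduces to the mean advantage. Limiting the size of each policy update in this way is what allows PPO to reuse a batch of samples for several gradient steps, which relates to the learning-efficiency claim above.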