Decision-making is critical for lane changes in autonomous driving. Reinforcement learning (RL) algorithms aim to estimate the value of behaviors in various situations, which makes them a promising approach to the decision-making problem. However, poor runtime safety prevents RL-based decision-making strategies from being applied to complex driving tasks in practice. To address this problem, this paper incorporates human demonstrations into an RL-based decision-making strategy. Decisions made by human subjects in a driving simulator are treated as safe demonstrations, which are stored in the replay buffer and then used to enhance RL training. A complex lane-change task in an off-ramp scenario is established to evaluate the developed strategy. Simulation results suggest that human demonstrations can effectively improve the safety of RL decisions. Moreover, the proposed strategy outperforms other existing learning-based decision-making strategies on multiple driving performance metrics.
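To make the replay-buffer idea concrete, the following is a minimal sketch of one way human demonstrations could be stored alongside agent experience and mixed into training batches. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not specify the sampling scheme, so the names `Transition`, `DemoReplayBuffer`, and the `demo_fraction` parameter, as well as the choice to keep demonstrations permanently while agent transitions are evicted first-in-first-out, are all hypothetical.

```python
import random
from collections import deque
from typing import List, NamedTuple, Tuple


class Transition(NamedTuple):
    """One (s, a, r, s', done) step; is_demo marks human-generated data."""
    state: Tuple[float, ...]
    action: int
    reward: float
    next_state: Tuple[float, ...]
    done: bool
    is_demo: bool


class DemoReplayBuffer:
    """Replay buffer holding human demonstrations and agent experience.

    Assumption (not from the paper): demonstrations are never evicted,
    and each batch targets a fixed fraction of demonstration samples.
    """

    def __init__(self, capacity: int, demo_fraction: float = 0.25, seed: int = 0):
        if not 0.0 <= demo_fraction <= 1.0:
            raise ValueError("demo_fraction must lie in [0, 1]")
        self.demos: List[Transition] = []      # human demonstrations, kept forever
        self.agent = deque(maxlen=capacity)    # agent transitions, FIFO eviction
        self.demo_fraction = demo_fraction
        self.rng = random.Random(seed)

    def add_demo(self, t: Transition) -> None:
        self.demos.append(t._replace(is_demo=True))

    def add_agent(self, t: Transition) -> None:
        self.agent.append(t._replace(is_demo=False))

    def sample(self, batch_size: int) -> List[Transition]:
        # Target number of agent samples; the remainder is filled with
        # demonstrations, so early batches are demonstration-heavy.
        target_demo = round(batch_size * self.demo_fraction)
        n_agent = min(len(self.agent), batch_size - target_demo)
        n_demo = min(len(self.demos), batch_size - n_agent)
        batch = self.rng.sample(self.demos, n_demo)
        batch += self.rng.sample(list(self.agent), n_agent)
        self.rng.shuffle(batch)
        return batch


if __name__ == "__main__":
    buf = DemoReplayBuffer(capacity=1000, demo_fraction=0.25)
    # Hypothetical 2-D state (e.g. gap and relative speed) and 3 discrete actions.
    for i in range(10):
        buf.add_demo(Transition((float(i), 0.0), 1, 1.0, (float(i + 1), 0.0), False, True))
    for i in range(50):
        buf.add_agent(Transition((float(i), 1.0), 0, 0.0, (float(i + 1), 1.0), False, False))
    batch = buf.sample(8)
    print(sum(t.is_demo for t in batch), "demo samples out of", len(batch))
```

With this design, batches drawn before the agent has collected much experience consist mostly of human decisions, and the demonstration share settles toward `demo_fraction` as agent data accumulates. Any off-policy learner that trains from sampled transitions could consume such batches; the actual ratio and learner used in the paper are not stated in the abstract.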