Driving in dense traffic alongside human and autonomous drivers is a challenging task that requires high-level planning and reasoning, along with the ability to react quickly to changes in a dynamic environment. In this study, we propose a hierarchical learning approach that uses learned motion primitives as actions. The motion primitives are obtained through unsupervised skill discovery without a predetermined reward function, which allows them to be reused across different scenarios. This can reduce the total training time for applications that require multiple models with varying behavior. Simulation results demonstrate that the proposed approach yields driver models that achieve higher performance with less training than baseline reinforcement learning methods.
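As a rough illustration of what "motion primitives as actions" could look like, the following minimal Python sketch has a high-level policy pick a primitive index, which is then executed for a fixed number of low-level steps. Everything here is an assumption made for illustration: the toy point-mass dynamics, the three hand-written primitives (`cruise`, `accelerate`, `brake`), and the names `rollout`, `step`, and `primitive_len` are hypothetical and are not taken from the study. In the proposed approach the primitives would be learned through unsupervised skill discovery rather than written by hand.

```python
from typing import Callable, List, Tuple

# Toy stand-in for a driving state: (position, speed). Assumed for illustration.
State = Tuple[float, float]

# Hypothetical motion primitives: each maps a state to an acceleration command.
# These hand-written placeholders stand in for primitives that would be
# learned by unsupervised skill discovery.
def cruise(state: State) -> float:
    return 0.0

def accelerate(state: State) -> float:
    return 1.0

def brake(state: State) -> float:
    return -1.0

PRIMITIVES: List[Callable[[State], float]] = [cruise, accelerate, brake]

def step(state: State, accel: float, dt: float = 0.1) -> State:
    """Toy point-mass dynamics; speed is clipped at zero."""
    pos, speed = state
    speed = max(0.0, speed + accel * dt)
    return (pos + speed * dt, speed)

def rollout(
    high_level_policy: Callable[[State], int],
    state: State,
    horizon: int = 50,
    primitive_len: int = 10,
) -> Tuple[State, List[int]]:
    """Run the two-level hierarchy.

    Every `primitive_len` steps the high-level policy selects a primitive
    index; the selected primitive then produces the low-level control at
    each step until the next selection.
    """
    chosen: List[int] = []
    k = 0
    for t in range(horizon):
        if t % primitive_len == 0:
            k = high_level_policy(state)
            chosen.append(k)
        state = step(state, PRIMITIVES[k](state))
    return state, chosen

# Usage: a trivial high-level policy that always selects "accelerate".
final_state, chosen = rollout(lambda s: 1, (0.0, 0.0))
```

Because the high-level policy only chooses among a small set of primitives at a coarser time scale, different driver behaviors could in principle be obtained by retraining only that policy while reusing the same primitives, which is the reuse argument made in the abstract.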