The feasibility of collecting large amounts of expert demonstrations has inspired growing research interest in learning-to-drive settings, where models learn by imitating the driving behaviour of experts. However, relying exclusively on imitation can limit an agent's generalisability to novel scenarios that lie outside the support of the training data. In this paper, we address this challenge by factorising the driving task, based on the intuition that modular architectures are more generalisable and more robust to changes in the environment than monolithic, end-to-end frameworks. Specifically, we draw inspiration from the trajectory forecasting community and reformulate the learning-to-drive task as obstacle-aware perception and grounding, distribution-aware goal prediction, and model-based planning. First, we train the obstacle-aware perception module to extract a salient representation of the visual context. Then, we learn a multi-modal goal distribution by performing conditional density estimation with a normalising flow. Finally, we ground candidate trajectory predictions in road geometry and plan actions based on vehicle dynamics. Using the CARLA simulator, we report state-of-the-art results on the CARNOVEL benchmark.
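To make the goal-prediction step concrete, the following is a minimal, hypothetical sketch (not the paper's implementation) of conditional density estimation with an affine-coupling normalising flow in PyTorch. The context embedding, layer sizes, and 2-D goal parameterisation are illustrative assumptions; the flow is trained by maximising the likelihood of expert goals conditioned on a scene embedding.

```python
# Hypothetical sketch: conditional normalising flow for p(goal | context).
import torch
import torch.nn as nn

class ConditionalAffineCoupling(nn.Module):
    """Affine coupling layer: transforms one half of the input,
    conditioned on the other half and a scene-context embedding."""
    def __init__(self, dim=2, ctx_dim=64, hidden=128, flip=False):
        super().__init__()
        self.flip = flip
        half = dim // 2
        self.net = nn.Sequential(
            nn.Linear(half + ctx_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * (dim - half)),  # predicts log-scale and shift
        )

    def forward(self, x, ctx):
        x1, x2 = x.chunk(2, dim=-1)
        if self.flip:
            x1, x2 = x2, x1
        s, t = self.net(torch.cat([x1, ctx], dim=-1)).chunk(2, dim=-1)
        s = torch.tanh(s)                       # bounded log-scale for stability
        z2 = x2 * torch.exp(s) + t
        z = torch.cat([z2, x1] if self.flip else [x1, z2], dim=-1)
        return z, s.sum(dim=-1)                 # transformed sample, log|det J|

class GoalFlow(nn.Module):
    """Stack of coupling layers; log_prob gives log p(goal | ctx)
    via the change-of-variables formula under a standard-normal base."""
    def __init__(self, dim=2, ctx_dim=64, n_layers=4):
        super().__init__()
        self.layers = nn.ModuleList(
            ConditionalAffineCoupling(dim, ctx_dim, flip=(i % 2 == 1))
            for i in range(n_layers)
        )
        self.base = torch.distributions.Normal(0.0, 1.0)

    def log_prob(self, goal, ctx):
        z, log_det = goal, 0.0
        for layer in self.layers:
            z, ld = layer(z, ctx)
            log_det = log_det + ld
        return self.base.log_prob(z).sum(dim=-1) + log_det

# Training step: maximise the likelihood of expert goals under the conditional flow.
flow = GoalFlow()
ctx = torch.randn(8, 64)          # placeholder scene embeddings from the perception module
expert_goals = torch.randn(8, 2)  # placeholder 2-D expert goal positions
loss = -flow.log_prob(expert_goals, ctx).mean()
loss.backward()
```

Because the flow is invertible, sampling diverse candidate goals for the downstream planner amounts to drawing latents from the base distribution and applying the inverse transform conditioned on the same context.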