Designing a safe and human-like decision-making system for an autonomous vehicle is a challenging task. Generative imitation learning is one possible approach for automating policy-building by leveraging both real-world and simulated decisions. Previous work that applies generative imitation learning to autonomous driving policies focuses on learning a low-level controller for simple settings. However, to scale to complex settings, many autonomous driving systems combine fixed, safe, optimization-based low-level controllers with high-level decision-making logic that selects the appropriate task and associated controller. In this paper, we attempt to bridge this gap in complexity by employing Safety-Aware Hierarchical Adversarial Imitation Learning (SHAIL), a method for learning a high-level policy that selects from a set of low-level controller instances in a way that imitates low-level driving data on-policy. We introduce an urban roundabout simulator that controls non-ego vehicles using real data from the Interaction dataset. We then demonstrate empirically that even with simple controller options, our approach can produce better behavior than previous approaches in driver imitation that have difficulty scaling to complex environments. Our implementation is available at https://github.com/sisl/InteractionImitation.
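To illustrate the hierarchical structure described in the abstract, the following is a minimal sketch of a high-level policy step that selects among low-level controller instances. It is not the SHAIL implementation. Every name here (`ControllerOption`, `is_safe`, `select_option`), the proportional speed controller, the stationary-obstacle safety check, and the hard-coded scores are assumptions made for illustration. In the actual method the option scores would come from a learned high-level policy, and the abstract does not specify how safety is evaluated.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class ControllerOption:
    """Hypothetical low-level controller instance: track a target speed for a fixed horizon."""
    target_speed: float  # m/s
    duration: int        # number of simulation steps the option runs

    def act(self, speed: float, gain: float = 0.5) -> float:
        """Proportional acceleration command toward the target speed."""
        return gain * (self.target_speed - speed)


def is_safe(option: ControllerOption, gap: float, speed: float,
            dt: float = 0.1, min_gap: float = 2.0) -> bool:
    """Toy safety check (an assumption, not the paper's): roll the option forward
    against a stationary obstacle and require the gap to stay at or above min_gap."""
    for _ in range(option.duration):
        speed = max(0.0, speed + option.act(speed) * dt)
        gap -= speed * dt
        if gap < min_gap:
            return False
    return True


def select_option(options: Sequence[ControllerOption], scores: Sequence[float],
                  gap: float, speed: float) -> int:
    """High-level step: choose the highest-scoring option among those predicted safe."""
    safe: List[int] = [i for i, o in enumerate(options) if is_safe(o, gap, speed)]
    if not safe:
        # Fallback when nothing passes the check: the option with the lowest target speed.
        return min(range(len(options)), key=lambda i: options[i].target_speed)
    return max(safe, key=lambda i: scores[i])


# Three simple options; the scores stand in for the output of a learned high-level policy.
options = [ControllerOption(0.0, 10), ControllerOption(5.0, 10), ControllerOption(10.0, 10)]
scores = [0.1, 0.3, 0.6]

chosen = select_option(options, scores, gap=8.0, speed=5.0)
print(f"selected option: {options[chosen]}")
```

In this sketch the low-level controllers stay fixed, and only the choice among them would be learned, which mirrors the division of labor the abstract describes between fixed controllers and high-level decision logic.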