In this paper, we introduce the first learning-based planner to drive a car in dense, urban traffic using Inverse Reinforcement Learning (IRL). Our planner, DriveIRL, generates a diverse set of trajectory proposals, filters these trajectories with a lightweight and interpretable safety filter, and then uses a learned model to score each remaining trajectory. The best trajectory is then tracked by the low-level controller of our self-driving vehicle. We train our trajectory scoring model on a 500+ hour real-world dataset of expert driving demonstrations in Las Vegas within the maximum entropy IRL framework. DriveIRL's benefits include: a simple design due to only learning the trajectory scoring function, relatively interpretable features, and strong real-world performance. We validated DriveIRL on the Las Vegas Strip and demonstrated fully autonomous driving in heavy traffic, including scenarios involving cut-ins, abrupt braking by the lead vehicle, and hotel pickup/dropoff zones. Our dataset will be made public to help further research in this area.
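The pipeline described above (propose trajectories, apply a safety filter, score the survivors with a learned model, and train that model under maximum entropy IRL) can be illustrated with a minimal sketch. This is not the authors' implementation: the linear scoring model, the feature matrix, the function names (`select_trajectory`, `maxent_nll_and_grad`, `fit_weights`), and the toy numbers are all assumptions made for illustration. The sketch treats the proposal closest to the expert demonstration as the target and minimizes its negative log-likelihood under a softmax over the candidate set.

```python
import numpy as np


def select_trajectory(features, safe_mask, weights):
    """Pick the highest-scoring proposal among those that pass the safety filter.

    features : (N, D) array, one hypothetical feature vector per proposal.
    safe_mask: (N,) boolean array, True where the proposal passed the filter.
    weights  : (D,) array, parameters of an assumed linear scoring model.
    """
    if not np.any(safe_mask):
        raise ValueError("no proposal passed the safety filter")
    scores = features @ weights
    # Filtered-out proposals can never be selected.
    scores = np.where(safe_mask, scores, -np.inf)
    return int(np.argmax(scores))


def maxent_nll_and_grad(features, expert_idx, weights):
    """Negative log-likelihood of the expert proposal and its gradient.

    Under a maximum entropy model over a finite candidate set,
    P(i) is proportional to exp(score_i), so the gradient of the loss is
    (expected features under the model) - (features of the expert proposal).
    """
    scores = features @ weights
    scores = scores - scores.max()  # shift for numerical stability
    probs = np.exp(scores) / np.exp(scores).sum()
    nll = -np.log(probs[expert_idx])
    grad = probs @ features - features[expert_idx]
    return nll, grad


def fit_weights(features, expert_idx, lr=0.5, steps=200):
    """Plain gradient descent on the loss above (toy setting, single scene)."""
    weights = np.zeros(features.shape[1])
    for _ in range(steps):
        _, grad = maxent_nll_and_grad(features, expert_idx, weights)
        weights -= lr * grad
    return weights


# Toy scene: four proposals described by two made-up features.
features = np.array([[1.0, 0.0],
                     [0.5, 0.5],
                     [0.0, 1.0],
                     [0.2, 0.1]])
expert_idx = 2  # assumed: proposal 2 is closest to the expert demonstration

weights = fit_weights(features, expert_idx)
all_safe = np.ones(len(features), dtype=bool)
best = select_trajectory(features, all_safe, weights)
```

In this toy setting the learned weights rank the expert-like proposal highest when every proposal is safe. If the safety mask excludes that proposal, `select_trajectory` falls back to the best remaining candidate, which mirrors the separation between the hand-written filter and the learned scorer described in the abstract.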