Prior work has applied reinforcement learning and imitation learning to autonomous driving scenarios, but these approaches compromise either the safety or the efficiency of the algorithm. By embedding control barrier functions into the reinforcement learning policy, we obtain safe policies that optimize the performance of the autonomous vehicle. However, control barrier functions require a good approximation of the vehicle model. We therefore use probabilistic control barrier functions to account for the model uncertainty. The algorithm is implemented as an online version in the CARLA simulator (Dosovitskiy et al., 2017) and as an offline version on a dataset extracted from the NGSIM database. The proposed algorithm is not merely a safe ramp-merging algorithm but a safe autonomous driving algorithm, applied here to ramp merging on highways.
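To make the idea of a barrier-filtered policy concrete, the following is a minimal sketch and not the implementation described in this paper. It assumes a simple car-following setting with a time-headway barrier h = gap - d_min - headway * v_ego, and it models uncertainty only as a standard deviation on the lead vehicle's speed, which is tightened by a fixed multiple. All names (`CBFParams`, `barrier`, `safe_accel`) and all numeric defaults are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CBFParams:
    """Illustrative parameters; the values are assumptions, not from the paper."""
    d_min: float = 5.0      # minimum standstill gap [m]
    headway: float = 1.5    # desired time headway [s]
    alpha: float = 0.5      # class-K gain in the condition h_dot + alpha * h >= 0
    k_sigma: float = 2.0    # how many standard deviations to tighten by
    a_min: float = -6.0     # actuator limit, hardest braking [m/s^2]
    a_max: float = 3.0      # actuator limit, strongest acceleration [m/s^2]


def barrier(gap: float, v_ego: float, p: CBFParams) -> float:
    """Barrier value h; the state is considered safe when h >= 0."""
    return gap - p.d_min - p.headway * v_ego


def safe_accel(a_rl: float, gap: float, v_ego: float, v_lead: float,
               sigma_lead: float, p: CBFParams = CBFParams()) -> float:
    """Filter the RL acceleration a_rl through a tightened barrier condition.

    With h_dot = (v_lead - v_ego) - headway * a, the condition
    h_dot + alpha * h >= 0 is an upper bound on a. Because the problem is
    one-dimensional, the closest safe action to a_rl is min(a_rl, bound).
    """
    h = barrier(gap, v_ego, p)
    # Pessimistic lead speed: assume the lead is slower than estimated.
    v_lead_pess = v_lead - p.k_sigma * sigma_lead
    a_bound = ((v_lead_pess - v_ego) + p.alpha * h) / p.headway
    a = min(a_rl, a_bound)
    # Clip to actuator limits. If a_bound < a_min, the barrier condition
    # cannot be met and the function falls back to the hardest braking.
    return max(p.a_min, min(p.a_max, a))
```

With a large gap the filter leaves the policy's action unchanged; when the gap is tight and the lead vehicle is slower, it overrides the action with a braking command. A full treatment would replace the fixed `k_sigma` margin with a bound derived from a learned model of the dynamics.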