In this work, we propose a self-improving artificial intelligence system that uses black-box verification methods to enhance the safety performance of reinforcement learning (RL) based autonomous driving (AD) agents. RL methods have enjoyed popularity in AD applications in recent years. However, the performance of existing RL algorithms strongly depends on the diversity of the training scenarios: a lack of safety-critical scenarios during training can lead to poor generalization in real-world driving. We propose a novel framework in which the weaknesses of the training set are explored via black-box verification methods. Once AD failure scenarios are discovered, training of the RL agent is re-initiated to improve its performance on the previously unsafe scenarios. Simulation results show that the proposed approach efficiently discovers such safety failures in RL-based adaptive cruise control (ACC) applications and, through iterative application of our method, significantly reduces the number of vehicle collisions.
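The discover-then-retrain loop described above can be sketched in a few lines. This is only an illustrative toy, not the paper's actual system: the scenario (a single initial gap to a lead vehicle), the braking-threshold "policy", and the functions `is_safe`, `falsify`, and `retrain` are all hypothetical stand-ins for the real simulator, black-box falsifier, and RL training step.

```python
import random

# Toy stand-ins for an ACC setting: a scenario is the initial gap (in meters)
# to a lead vehicle, and the "policy" is a single braking-distance threshold.
# All names and dynamics here are illustrative assumptions, not the paper's.

def is_safe(policy_threshold, gap):
    """The agent avoids a collision only if the initial gap is at least
    its braking threshold (a crude stand-in for closed-loop simulation)."""
    return gap >= policy_threshold

def falsify(policy_threshold, n_samples=1000, seed=0):
    """Black-box verification by random scenario sampling: return the
    scenarios in which the current agent fails."""
    rng = random.Random(seed)
    samples = (rng.uniform(5.0, 50.0) for _ in range(n_samples))
    return [g for g in samples if not is_safe(policy_threshold, g)]

def retrain(policy_threshold, failures):
    """Stand-in for re-initiated RL training on discovered failures:
    adapt the policy to handle the hardest (smallest-gap) failure found."""
    if failures:
        policy_threshold = min(failures)
    return policy_threshold

# Iterative self-improvement loop: falsify, then retrain on the failures.
policy = 30.0  # initial policy, unsafe for gaps below 30 m
for iteration in range(3):
    failures = falsify(policy)
    policy = retrain(policy, failures)
```

In the real framework the random sampler would be replaced by a guided black-box falsification method and `retrain` by a full RL training run seeded with the discovered scenarios, but the overall control flow of the loop is the same.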