Deep reinforcement learning is widely used to train autonomous cars in simulated environments, yet autonomous cars are well known to be vulnerable to adversarial attacks. This raises the question of whether we can train an adversary as a driving agent to find failure scenarios in autonomous cars, and then retrain the autonomous cars on the resulting adversarial inputs to improve their robustness. In this work, we first train and compare adversarial car policies under two custom reward functions to test the driving control decisions of autonomous cars in a multi-agent setting. Second, we verify that adversarial examples can be used not only to expose unwanted autonomous driving behavior, but also to help autonomous cars improve their deep reinforcement learning policies. Using a high-fidelity urban driving simulation environment and vision-based driving agents, we demonstrate that autonomous cars retrained against the adversary player noticeably improve the performance of their driving policies in terms of reduced collisions and off-road steering errors.
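To make the idea of training the adversary under custom reward functions concrete, the sketch below shows two hypothetical adversarial rewards of the kind the abstract alludes to: a sparse reward paid when the ego (autonomous) car fails, and a dense reward for closing the distance to it. The function names, reward magnitudes, and inputs are assumptions for illustration, not the paper's actual formulation.

```python
import numpy as np


def adversary_reward_sparse(ego_collided: bool, adv_offroad: bool) -> float:
    """Hypothetical sparse adversarial reward: the adversary is rewarded for
    forcing the ego car into a collision, but penalized for driving off-road
    itself, so failures must be provoked through plausible driving."""
    reward = 0.0
    if ego_collided:
        reward += 10.0   # assumed bonus when the ego car crashes
    if adv_offroad:
        reward -= 5.0    # assumed penalty discouraging reckless adversary behavior
    return reward


def adversary_reward_dense(ego_pos: np.ndarray, adv_pos: np.ndarray) -> float:
    """Hypothetical dense adversarial reward: the adversary is rewarded for
    reducing its distance to the ego car, encouraging near-miss and cut-in
    scenarios that stress the ego car's control decisions."""
    return -float(np.linalg.norm(ego_pos - adv_pos))
```

Under this kind of setup, episodes in which the adversary triggers collisions or off-road steering would be collected and replayed when retraining the autonomous car's policy, which is the retraining loop the abstract describes.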