How to explore corner cases as efficiently and thoroughly as possible has long been one of the top concerns in deep reinforcement learning (DeepRL) for autonomous driving. Training with simulated data is less costly and less dangerous than using real-world data, but mismatched parameter distributions and inaccurate system modeling in simulators lead to an inevitable Sim2real gap, which probably accounts for the underperformance in novel, anomalous, and risky cases that simulators can hardly generate. Domain Randomization (DR) is a methodology that can bridge this gap with little or no real-world data. Consequently, in this research, an adversarial model is put forward to robustify DeepRL-based autonomous vehicles trained in simulation by gradually surfacing harder events, so that the models can readily transfer to the real world.