Reinforcement learning (RL) is a promising approach for autonomous vehicles that must handle complex and uncertain traffic environments. However, the RL training process is expensive, unsafe, and time-consuming. Algorithms are therefore often developed in simulation first and then transferred to the real world, which leads to the common sim2real challenge: performance degrades when the domain changes. In this paper, we propose a transfer learning process that minimizes this gap by exploiting digital twin technology, relying on a systematic and simultaneous combination of virtual and real-world data from vehicle dynamics and traffic scenarios. The model and testing environment evolve through model-in-the-loop, hardware-in-the-loop, vehicle-in-the-loop, and proving-ground testing stages, similar to the standard development cycle in the automotive industry. We also integrate other transfer learning techniques, such as domain randomization and domain adaptation, in each stage. Simulation and real data are gradually incorporated to accelerate the transfer learning process and make it more robust. The proposed RL methodology is applied to develop a path-following steering controller for an autonomous electric vehicle. After learning and deploying the real-time RL control policy on the vehicle, we obtained satisfactory and safe control performance from the very first deployment, demonstrating the advantages of the proposed digital-twin-based learning process.
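As an illustration of the domain randomization mentioned in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes a kinematic bicycle model tracking a straight reference line, and every name in it (`VehicleParams`, `sample_params`, `rollout`, `evaluate`, `placeholder_policy`), along with the parameter ranges, is hypothetical. A hand-written proportional controller stands in for the learned RL policy.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class VehicleParams:
    wheelbase_m: float       # distance between front and rear axles
    steer_delay_steps: int   # actuator delay, in simulation steps
    speed_mps: float         # constant longitudinal speed


def sample_params(rng: random.Random) -> VehicleParams:
    """Draw one randomized parameter set (ranges are illustrative assumptions)."""
    return VehicleParams(
        wheelbase_m=rng.uniform(2.4, 3.0),
        steer_delay_steps=rng.randint(0, 3),
        speed_mps=rng.uniform(3.0, 10.0),
    )


def placeholder_policy(lateral_error: float, heading: float) -> float:
    """Stand-in for a learned policy: clamped proportional steering command."""
    steer = -0.3 * lateral_error - 0.8 * heading
    return max(-0.5, min(0.5, steer))


def rollout(params: VehicleParams, policy, steps: int = 200, dt: float = 0.05) -> float:
    """Simulate path following along the line y = 0 with a kinematic bicycle model.

    Returns the mean absolute lateral error over the episode.
    """
    y, heading = 1.0, 0.0                         # start 1 m off the path
    pending = [0.0] * params.steer_delay_steps    # queue modelling actuator delay
    total_error = 0.0
    for _ in range(steps):
        pending.append(policy(y, heading))
        steer = pending.pop(0)                    # command issued delay steps ago
        y += params.speed_mps * math.sin(heading) * dt
        heading += params.speed_mps / params.wheelbase_m * math.tan(steer) * dt
        total_error += abs(y)
    return total_error / steps


def evaluate(policy, episodes: int, seed: int) -> list:
    """Run the policy once per randomized vehicle; each episode sees new dynamics."""
    rng = random.Random(seed)
    return [rollout(sample_params(rng), policy) for _ in range(episodes)]


if __name__ == "__main__":
    for i, err in enumerate(evaluate(placeholder_policy, episodes=5, seed=0)):
        print(f"episode {i}: mean |lateral error| = {err:.3f} m")
```

In a training loop, the same per-episode resampling would be applied before each rollout so that the policy cannot overfit to a single set of simulated dynamics. How the randomization ranges would be narrowed as real data arrives in later stages is not specified in the abstract and is left out here.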