Autonomous car racing is a major challenge in robotics. It raises fundamental problems for classical approaches, such as planning minimum-time trajectories under uncertain dynamics and controlling the car at the limits of its handling. In addition, the requirement of minimizing lap time, which is a sparse objective, and the difficulty of collecting training data from human experts have hindered researchers from directly applying learning-based approaches to the problem. In the present work, we propose a learning-based system for autonomous car racing that leverages a high-fidelity physical car simulation, a course-progress proxy reward, and deep reinforcement learning. We deploy our system in Gran Turismo Sport, a world-leading car simulator known for its realistic physics simulation of different race cars and tracks, which is even used to recruit human race car drivers. Our trained policy achieves autonomous racing performance that surpasses what had previously been achieved by the built-in AI and, at the same time, outperforms the fastest driver in a dataset of over 50,000 human players.
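To make the course-progress proxy reward concrete, the sketch below shows one way such a dense, per-step reward could be computed as a stand-in for the sparse lap-time objective. It is illustrative only: the function name, the wall-contact penalty, and its weight are assumptions for exposition, not the exact reward formulation used in this work.

```python
def course_progress_reward(progress_prev: float,
                           progress_curr: float,
                           hit_wall: bool,
                           wall_penalty: float = 1.0) -> float:
    """Dense proxy reward for the sparse lap-time objective.

    Rather than rewarding the agent only when a lap is completed, each
    control step is rewarded by how far the car advanced along the track
    centerline, with an optional (illustrative) penalty for wall contact.
    """
    # Progress gained along the centerline since the last step (meters).
    reward = progress_curr - progress_prev
    # Hypothetical penalty discouraging collisions with track boundaries.
    if hit_wall:
        reward -= wall_penalty
    return reward


# Example: the car advanced about 3.2 m along the track without touching a wall.
print(course_progress_reward(1204.5, 1207.7, hit_wall=False))  # ~3.2
```

Because progress is awarded at every step, the agent receives a learning signal throughout the lap rather than only at the finish line, which is what makes the lap-time objective tractable for deep reinforcement learning.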