We present the results of our autonomous racing virtual challenge, based on the newly released Learn-to-Race (L2R) simulation framework, which seeks to encourage interdisciplinary research in autonomous driving and to help advance the state of the art on a realistic benchmark. Just as racing is used to test cutting-edge vehicles, we envision autonomous racing serving as a particularly challenging proving ground for autonomous agents, because: (i) they need to make sub-second, safety-critical decisions in a complex, fast-changing environment; and (ii) both perception and control must be robust to distribution shifts, novel road features, and unseen obstacles. Thus, the main goal of the challenge is to evaluate the joint safety, performance, and generalisation capabilities of reinforcement learning agents on multi-modal perception, through a two-stage process. In the first stage of the challenge, we evaluate an autonomous agent's ability to drive as fast as possible while adhering to safety constraints. In the second stage, we additionally require the agent to adapt to an unseen racetrack through safe exploration. In this paper, we describe the new L2R Task 2.0 benchmark, with refined metrics and baseline approaches. We also provide an overview of deployment, evaluation, and rankings for the inaugural instance of the L2R Autonomous Racing Virtual Challenge (supported by Carnegie Mellon University, Arrival Ltd., AICrowd, Amazon Web Services, and Honda Research), which officially used the new L2R Task 2.0 benchmark and received over 20,100 views, 437 active participants, 46 teams, and 733 model submissions, from 88+ unique institutions in 58+ different countries. Finally, we release leaderboard results from the challenge and provide descriptions of the two top-ranking approaches in cross-domain model transfer, across multiple sensor configurations and simulated races.