We present the results of our autonomous racing virtual challenge, based on the newly released Learn-to-Race (L2R) simulation framework, which seeks to encourage interdisciplinary research in autonomous driving and to help advance the state of the art on a realistic benchmark. Just as racing is used to test cutting-edge vehicles, we envision autonomous racing serving as a particularly challenging proving ground for autonomous agents, because: (i) they need to make sub-second, safety-critical decisions in a complex, fast-changing environment; and (ii) both perception and control must be robust to distribution shifts, novel road features, and unseen obstacles. Thus, the main goal of the challenge is to evaluate the joint safety, performance, and generalisation capabilities of reinforcement learning agents operating on multi-modal perception, through a two-stage process. In the first stage of the challenge, we evaluate an autonomous agent's ability to drive as fast as possible while adhering to safety constraints. In the second stage, we additionally require the agent to adapt to an unseen racetrack through safe exploration. In this paper, we describe the new L2R Task 2.0 benchmark, with refined metrics and baseline approaches. We also provide an overview of deployment, evaluation, and rankings for the inaugural instance of the L2R Autonomous Racing Virtual Challenge (supported by Carnegie Mellon University, Arrival Ltd., AICrowd, Amazon Web Services, and Honda Research), which officially used the new L2R Task 2.0 benchmark and received over 20,100 views, 437 active participants, 46 teams, and 733 model submissions, from 88 unique institutions in 28 different countries. Finally, we release leaderboard results from the challenge and describe the two top-ranking approaches in cross-domain model transfer, across multiple sensor configurations and simulated races.