Imitation learning (IL) has emerged as a central paradigm in autonomous driving. While IL excels at matching expert behavior in open-loop settings by minimizing per-step prediction errors, its performance degrades unexpectedly in closed-loop evaluation because small, often imperceptible errors accumulate over time. Over successive planning cycles these errors compound, potentially resulting in severe failures. Current research predominantly relies on increasingly sophisticated network architectures or high-fidelity training datasets to make IL planners robust to error accumulation, focusing on state-level robustness at a single time point. However, autonomous driving is inherently a continuous-time process, and leveraging the temporal scale to enhance robustness may provide a new perspective on this problem. To this end, we propose Sequence of Experts (SoE), a temporal alternation policy that improves closed-loop performance without increasing model size or data requirements. Experiments on the large-scale autonomous driving benchmark nuPlan demonstrate that SoE consistently and significantly improves the performance of all evaluated models and achieves state-of-the-art results. This module may provide a key and broadly applicable means of improving the training efficiency of autonomous driving models.
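The open-loop/closed-loop gap described above can be illustrated with a deliberately simple toy rollout (this sketch is not the paper's SoE method; the linear dynamics, gains, and bias term are illustrative assumptions). In open-loop evaluation, the learned policy is queried on states the expert visits, so its per-step error stays fixed; in closed-loop rollout, the policy's own slightly biased actions shape the states it sees next, and the resulting state deviation grows well beyond the per-step error.

```python
# Toy illustration (assumed dynamics, not from the paper) of why a small
# per-step imitation error stays bounded in open-loop evaluation but
# accumulates over a closed-loop rollout.

def expert(x):
    # Expert policy: gently drive the 1-D state toward zero.
    return -0.1 * x

def learner(x):
    # Imitation policy with a tiny constant bias (the "imperceptible" error).
    return -0.1 * x + 0.01

def step(x, u):
    # Trivial single-integrator dynamics: next state = state + action.
    return x + u

# Open-loop: evaluate the learner on states visited by the expert.
x = 1.0
open_loop_errors = []
for _ in range(50):
    open_loop_errors.append(abs(learner(x) - expert(x)))  # stays ~0.01
    x = step(x, expert(x))

# Closed-loop: the learner's own actions determine the states it visits,
# so the bias feeds back into the state and compounds over planning cycles.
x_expert, x_learner = 1.0, 1.0
for _ in range(50):
    x_expert = step(x_expert, expert(x_expert))
    x_learner = step(x_learner, learner(x_learner))

per_step_error = max(open_loop_errors)
closed_loop_deviation = abs(x_learner - x_expert)
print(per_step_error, closed_loop_deviation)
```

Under these assumed dynamics the closed-loop state deviation settles near ten times the per-step action error, which is the compounding effect that architecture- or data-centric robustness work targets at the state level, and that SoE instead addresses along the temporal axis.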