Modern autonomous driving systems are organized as modular tasks in sequential order: perception, prediction, and planning. To perform a wide diversity of tasks and achieve advanced-level intelligence, contemporary approaches either deploy standalone models for individual tasks or design a multi-task paradigm with separate heads. However, these approaches may suffer from accumulated errors or deficient task coordination. Instead, we argue that a favorable framework should be devised and optimized in pursuit of the ultimate goal, i.e., planning for the self-driving car. To this end, we revisit the key components within perception and prediction, and prioritize the tasks such that all of them contribute to planning. We introduce Unified Autonomous Driving (UniAD), a comprehensive, up-to-date framework that incorporates full-stack driving tasks in one network. It is carefully designed to leverage the advantages of each module and to provide complementary feature abstractions for agent interaction from a global perspective. Tasks communicate through unified query interfaces so that they facilitate one another toward planning. We instantiate UniAD on the challenging nuScenes benchmark. Extensive ablations demonstrate the effectiveness of this design philosophy: UniAD substantially outperforms previous state-of-the-art methods in all aspects. Code and models are public.