Autonomous vehicle (AV) stacks are typically built in a modular fashion, with explicit components performing detection, tracking, prediction, planning, control, etc. While modularity improves reusability, interpretability, and generalizability, it also suffers from compounding errors, information bottlenecks, and integration challenges. To overcome these challenges, a prominent approach is to convert the AV stack into an end-to-end neural network and train it with data. While such approaches have achieved impressive results, they typically lack interpretability and reusability, and they eschew principled analytical components, such as planning and control, in favor of deep neural networks. To enable the joint optimization of AV stacks while retaining modularity, we present DiffStack, a differentiable and modular stack for prediction, planning, and control. Crucially, our model-based planning and control algorithms leverage recent advancements in differentiable optimization to produce gradients, enabling optimization of upstream components, such as prediction, via backpropagation through planning and control. Our results on the nuScenes dataset indicate that end-to-end training with DiffStack yields substantial improvements in open-loop and closed-loop planning metrics by, e.g., learning to make fewer prediction errors that would affect planning. Beyond these immediate benefits, DiffStack opens up new opportunities for fully data-driven yet modular and interpretable AV architectures. Project website: https://sites.google.com/view/diffstack
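To make the central idea concrete (gradients flowing from a planning objective back into the prediction module), the following is a minimal toy sketch, not DiffStack's actual implementation. It assumes a one-dimensional world, a linear predictor with a single weight `w`, and a planner whose quadratic cost has a closed-form minimizer; the function names, the cost function, and the parameters `goal`, `margin`, and `lam` are all hypothetical and chosen only for illustration.

```python
# Toy 1-D sketch: train a predictor through a differentiable planner.
# All names and the cost function are illustrative assumptions, not DiffStack's API.

def predict(w, x):
    """Linear predictor: predicted agent position from a scalar feature x."""
    return w * x

def plan(p, goal=0.0, margin=2.0, lam=1.0):
    """Closed-form minimizer over u of (u - goal)^2 + lam * (u - (p + margin))^2."""
    return (goal + lam * (p + margin)) / (1.0 + lam)

def plan_grad_wrt_prediction(lam=1.0):
    """Derivative of the planned position with respect to the prediction p."""
    return lam / (1.0 + lam)

def planning_loss(w, x, u_expert):
    """Squared distance between the planned and the expert ego position."""
    return (plan(predict(w, x)) - u_expert) ** 2

def planning_loss_grad(w, x, u_expert):
    """Chain rule through the planner: dL/dw = dL/du * du/dp * dp/dw."""
    u = plan(predict(w, x))
    return 2.0 * (u - u_expert) * plan_grad_wrt_prediction() * x

def train(w, data, lr=0.1, steps=200):
    """Plain gradient descent on the mean planning loss."""
    for _ in range(steps):
        grad = sum(planning_loss_grad(w, x, u) for x, u in data) / len(data)
        w -= lr * grad
    return w

# Synthetic expert plans generated with a hypothetical "true" weight of 1.5.
TRUE_W = 1.5
data = [(x, plan(predict(TRUE_W, x))) for x in (0.5, 1.0, 2.0)]
w_trained = train(0.0, data)
```

In this sketch the predictor never sees a prediction label: it is fit only through the planning loss, which is the sense in which upstream components are optimized "via backpropagation through planning and control". A real stack would replace the closed-form planner with a differentiable optimizer and the hand-written chain rule with automatic differentiation.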