This paper proposes a novel deep learning framework for multi-modal motion prediction. The framework consists of three parts: recurrent neural networks that encode the target agent's motion, convolutional neural networks that encode the rasterized environment representation, and a distance-based attention mechanism that models the interactions among different agents. We validate the proposed framework on a large-scale real-world driving dataset, the Waymo Open Motion Dataset, and compare its performance against other methods on the standard test benchmark. The qualitative results show that the trajectories predicted by our model are accurate, diverse, and consistent with the road structure. The quantitative results on the standard benchmark show that our model outperforms the baseline methods in prediction accuracy and other evaluation metrics. The proposed framework placed second in the 2021 Waymo Open Dataset Motion Prediction Challenge.
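To illustrate the idea of distance-based attention over neighboring agents, here is a minimal sketch in plain Python. The abstract does not specify the exact formulation, so everything here is an assumption made for illustration: the function name `distance_attention`, the `temperature` parameter, and the choice of a softmax over negative Euclidean distances are hypothetical and may differ from the paper's actual mechanism.

```python
import math


def distance_attention(target_xy, neighbor_xy, neighbor_feats, temperature=1.0):
    """Pool neighbor features with weights that decay with distance to the target.

    Hypothetical illustration only; not the paper's exact attention mechanism.
    """
    if not neighbor_xy:
        return [], []
    # Euclidean distance from the target agent to each neighboring agent.
    dists = [math.hypot(x - target_xy[0], y - target_xy[1]) for x, y in neighbor_xy]
    # Assumed scoring rule: softmax over negative distances,
    # so that closer agents receive larger weights.
    logits = [-d / temperature for d in dists]
    max_logit = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - max_logit) for v in logits]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted sum of the neighbors' feature vectors.
    dim = len(neighbor_feats[0])
    pooled = [
        sum(w * feat[k] for w, feat in zip(weights, neighbor_feats))
        for k in range(dim)
    ]
    return weights, pooled


# Two neighbors: one 1 m away and one 10 m away from the target agent.
weights, pooled = distance_attention(
    target_xy=(0.0, 0.0),
    neighbor_xy=[(1.0, 0.0), (10.0, 0.0)],
    neighbor_feats=[[1.0, 0.0], [0.0, 1.0]],
)
```

In this sketch the nearer neighbor dominates the pooled interaction feature. In a full model, the pooled vector would presumably be combined with the recurrent and convolutional encodings before decoding multiple candidate trajectories.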