Scene Transformer: A unified multi-task model for behavior prediction and planning

Jiquan Ngiam, Benjamin Caine, Vijay Vasudevan, Zhengdong Zhang, Hao-Tien Lewis Chiang, Jeffrey Ling, Rebecca Roelofs, Alex Bewley, Chenxi Liu, Ashish Venugopal, David Weiss, Ben Sapp, Zhifeng Chen, Jonathon Shlens

Predicting the future motion of multiple agents is necessary for planning in dynamic environments. This task is challenging for autonomous driving since agents (e.g., vehicles and pedestrians) and their associated behaviors may be diverse and influence each other. Most prior work has focused on first predicting independent futures for each agent based on all past motion, and then planning against these independent predictions. However, planning against fixed predictions can suffer from the inability to represent the future interaction possibilities between different agents, leading to sub-optimal planning. In this work, we formulate a model for predicting the behavior of all agents jointly in real-world driving environments in a unified manner. Inspired by recent language modeling approaches, we use a masking strategy as the query to our model, enabling one to invoke a single model to predict agent behavior in many ways, such as potentially conditioned on the goal or full future trajectory of the autonomous vehicle or the behavior of other agents in the environment. Our model architecture fuses heterogeneous world state in a unified Transformer architecture by employing attention across road elements, agent interactions and time steps. We evaluate our approach on autonomous driving datasets for behavior prediction, and achieve state-of-the-art performance. Our work demonstrates that formulating the problem of behavior prediction in a unified architecture with a masking strategy may allow us to have a single model that can perform multiple motion prediction and planning related tasks effectively.
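The idea of using a mask as the query can be illustrated with a small sketch. This is not the authors' implementation: it assumes the scene is laid out as an agents-by-time-steps grid, and the function name `build_visibility_mask`, the task labels, and the `av_index` parameter are hypothetical names chosen for illustration. Each task differs only in which cells are shown to the model; the hidden cells are the ones it would be asked to predict.

```python
import numpy as np


def build_visibility_mask(num_agents, num_steps, current_step, task, av_index=0):
    """Return a boolean [num_agents, num_steps] array; True means the cell is shown.

    Hypothetical helper: the task names and the grid layout are assumptions
    made for this sketch, not an API from the paper.
    """
    if not 0 <= current_step < num_steps:
        raise ValueError("current_step must lie inside the time axis")

    visible = np.zeros((num_agents, num_steps), dtype=bool)
    # Every task reveals the observed history up to and including the current step.
    visible[:, : current_step + 1] = True

    if task == "motion_prediction":
        # All futures stay hidden; the model predicts every agent jointly.
        pass
    elif task == "conditional_motion_prediction":
        # Reveal the autonomous vehicle's full future trajectory.
        visible[av_index, :] = True
    elif task == "goal_conditioned_prediction":
        # Reveal only the autonomous vehicle's final step (its goal).
        visible[av_index, -1] = True
    else:
        raise ValueError(f"unknown task: {task}")
    return visible


if __name__ == "__main__":
    for task in (
        "motion_prediction",
        "conditional_motion_prediction",
        "goal_conditioned_prediction",
    ):
        mask = build_visibility_mask(
            num_agents=3, num_steps=5, current_step=1, task=task
        )
        print(task)
        print(mask.astype(int))
```

Under these assumptions, switching tasks changes only the mask passed alongside the scene, which is what would let a single trained model serve several prediction and planning queries.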
