Forecasting the future behavior of all traffic agents in the vicinity is a key task in building safe and reliable autonomous driving systems. The problem is challenging because agents adjust their behavior according to their own intentions, the actions of other agents, and the road layout. In this paper, we propose Decoder Fusion RNN (DF-RNN), a recurrent, attention-based approach to motion forecasting. Our network is composed of a recurrent behavior encoder, an inter-agent multi-headed attention module, and a context-aware decoder. We design a map encoder that embeds polyline segments, combines them into a graph structure, and merges their relevant parts with the agents' embeddings. We fuse the encoded map information with further inter-agent interactions only inside the decoder, and we propose explicit training as a method to make effective use of the available information. We demonstrate the efficacy of our method on the Argoverse motion forecasting dataset and show state-of-the-art performance on the public benchmark.
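To make the data flow of such an architecture concrete, the following is a minimal NumPy sketch of a recurrent encoder, an inter-agent multi-headed attention step, and a decoder that fuses map and interaction context at every prediction step. It is not the DF-RNN implementation: the function names (`rnn_encode`, `multi_head_attention`, `forecast`), the layer sizes, the random untrained weights, and the representation of each polyline segment as a 4-vector of start and end coordinates are all assumptions made for illustration. The graph structure of the map encoder and the explicit training scheme are omitted; attention over segment embeddings stands in for selecting the relevant map parts.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed; weights are random placeholders, not trained


def dense(n_in, n_out):
    """Random matrix standing in for a learned linear layer (assumption)."""
    return rng.normal(scale=0.1, size=(n_in, n_out))


def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)


def rnn_encode(tracks, w_in, w_h):
    """Plain tanh RNN over past positions; tracks has shape (agents, steps, 2)."""
    h = np.zeros((tracks.shape[0], w_h.shape[0]))
    for t in range(tracks.shape[1]):
        h = np.tanh(tracks[:, t] @ w_in + h @ w_h)
    return h  # (agents, hidden)


def multi_head_attention(query, keys, wq, wk, wv, heads):
    """Scaled dot-product attention from query rows to key rows, split into heads."""
    n_q, d = query.shape
    n_k = keys.shape[0]
    dh = d // heads
    q = (query @ wq).reshape(n_q, heads, dh).transpose(1, 0, 2)  # (heads, n_q, dh)
    k = (keys @ wk).reshape(n_k, heads, dh).transpose(1, 0, 2)   # (heads, n_k, dh)
    v = (keys @ wv).reshape(n_k, heads, dh).transpose(1, 0, 2)   # (heads, n_k, dh)
    weights = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(dh))    # (heads, n_q, n_k)
    out = weights @ v                                            # (heads, n_q, dh)
    return out.transpose(1, 0, 2).reshape(n_q, d)


def forecast(tracks, polylines, hidden=16, heads=4, horizon=5):
    """Predict (agents, horizon, 2) future positions from past tracks and map segments."""
    # Behavior encoder: one hidden state per agent.
    h = rnn_encode(tracks, dense(2, hidden), dense(hidden, hidden))
    # Inter-agent attention applied to the encoded states.
    enc_att = [dense(hidden, hidden) for _ in range(3)]
    h = h + multi_head_attention(h, h, *enc_att, heads)
    # Simplified map encoder: one embedding per polyline segment (no graph).
    map_emb = np.tanh(polylines @ dense(4, hidden))
    # Decoder weights; map and interaction context are fused only in this loop.
    agent_att = [dense(hidden, hidden) for _ in range(3)]
    map_att = [dense(hidden, hidden) for _ in range(3)]
    w_dec, w_out = dense(3 * hidden, hidden), dense(hidden, 2)
    pos, preds = tracks[:, -1], []
    for _ in range(horizon):
        social = multi_head_attention(h, h, *agent_att, heads)    # other agents
        road = multi_head_attention(h, map_emb, *map_att, heads)  # map segments
        h = np.tanh(np.concatenate([h, social, road], axis=1) @ w_dec)
        pos = pos + h @ w_out  # predict a displacement and accumulate it
        preds.append(pos)
    return np.stack(preds, axis=1)


tracks = rng.normal(size=(3, 10, 2))  # 3 agents, 10 observed steps (synthetic)
polylines = rng.normal(size=(6, 4))   # 6 map segments (synthetic)
future = forecast(tracks, polylines)
print(future.shape)  # (3, 5, 2)
```

The point of the sketch is the placement of the fusion: the map embeddings never enter the encoder, and both context vectors are recomputed from the current decoder state at every step, so the context an agent attends to can change along the predicted trajectory.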