Predicting the future locations of agents in a scene is an important problem in self-driving. In recent years, there has been significant progress in representing the scene and the agents within it. The interactions of agents with the scene and with each other are typically modeled with a Graph Neural Network. However, the graph structure is mostly static and fails to capture the temporal changes in highly dynamic scenes. In this work, we propose a temporal graph representation to better capture the dynamics of traffic scenes. We complement our representation with two types of memory modules: one focusing on the agent of interest and the other on the entire scene. This allows us to learn temporally-aware representations that achieve good results even with simple regression of multiple futures. When combined with goal-conditioned prediction, we obtain further improvements, reaching state-of-the-art performance on the Argoverse benchmark.
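At a high level, the two memory modules can be thought of as recurrent states that accumulate per-timestep features of the agent of interest and of the whole scene, and then jointly condition the trajectory decoder. The following is a toy numpy sketch of that idea, not the paper's implementation: the gated-blend update, the feature dimension, and the mean-pooling over agents are all illustrative assumptions.

```python
import numpy as np

def update_memory(memory, features, w_m, w_f):
    # Gated blend of the previous memory and the current features;
    # a simple stand-in for a learned recurrent update (assumption).
    gate = 1.0 / (1.0 + np.exp(-(memory @ w_m + features @ w_f)))
    return gate * memory + (1.0 - gate) * features

rng = np.random.default_rng(0)
d = 16   # feature dimension (assumed for illustration)
T = 10   # number of observed timesteps

w_m = rng.normal(scale=0.1, size=(d, d))
w_f = rng.normal(scale=0.1, size=(d, d))

agent_memory = np.zeros(d)   # memory focused on the agent of interest
scene_memory = np.zeros(d)   # memory over the entire scene

for t in range(T):
    # Hypothetical per-timestep features; in practice these would come
    # from the temporal graph encoder.
    agent_feat = rng.normal(size=d)
    scene_feat = rng.normal(size=(5, d)).mean(axis=0)  # pooled over 5 agents
    agent_memory = update_memory(agent_memory, agent_feat, w_m, w_f)
    scene_memory = update_memory(scene_memory, scene_feat, w_m, w_f)

# Both memories jointly condition the downstream trajectory decoder.
context = np.concatenate([agent_memory, scene_memory])
print(context.shape)
```

The point of keeping two separate memories is that the agent-focused one tracks the history most relevant to the prediction target, while the scene-level one summarizes context that changes as other agents move.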