In this paper, we propose HOME, a framework that tackles the motion forecasting problem with an image output representing the probability distribution of the agent's future location. This approach allows for a simple architecture that couples classic convolutional networks with an attention mechanism for agent interactions, and it outputs an unconstrained 2D top-view representation of the agent's possible future. Based on this output, we design two methods to sample a finite set of possible future locations for the agent. These methods let us control the trade-off between miss rate and final displacement error across multiple modalities without retraining any part of the model. We apply our method to the Argoverse Motion Forecasting Benchmark and achieve 1st place on the online leaderboard.
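The abstract does not spell out the two sampling methods, so the following is only a minimal sketch of one plausible way to draw a finite set of endpoints from a probability heatmap: greedily pick the grid cell whose surrounding disk covers the most remaining probability mass, zero out that disk, and repeat. The function name `sample_endpoints`, its parameters `k` and `radius` (in grid cells), and the greedy coverage rule itself are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np


def sample_endpoints(heatmap, k=6, radius=2.0):
    """Greedily pick k grid cells from a 2D probability heatmap.

    Hypothetical sketch: each pick maximizes the probability mass lying
    within `radius` cells of it; that mass is then removed so later picks
    cover other regions.
    """
    h = np.asarray(heatmap, dtype=float).copy()
    height, width = h.shape
    rows, cols = np.indices(h.shape)
    r = int(np.floor(radius))
    # Integer offsets that fall inside a disk of the given radius.
    offsets = [
        (dy, dx)
        for dy in range(-r, r + 1)
        for dx in range(-r, r + 1)
        if dy * dy + dx * dx <= radius * radius
    ]

    picks = []
    for _ in range(k):
        # Mass covered by a disk centered on every cell (zero padding at borders).
        padded = np.pad(h, r)
        covered = np.zeros_like(h)
        for dy, dx in offsets:
            covered += padded[r + dy : r + dy + height, r + dx : r + dx + width]

        y, x = np.unravel_index(np.argmax(covered), covered.shape)
        picks.append((int(y), int(x)))

        # Remove the mass already covered by this pick.
        h[(rows - y) ** 2 + (cols - x) ** 2 <= radius * radius] = 0.0
    return picks
```

Under these assumptions, `radius` is applied after the network has produced its heatmap, so it can be changed without retraining. In this sketch a larger radius spreads the picks over more of the probability mass, while a smaller one keeps them close to the heatmap peaks.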