Motion prediction (MP) of multiple agents is a crucial task in arbitrarily complex environments, from social robots to self-driving cars. Current approaches tackle this problem using end-to-end networks, where the input data is usually a rendered top view of the scene and the past trajectories of all the agents; leveraging this information is essential to obtain optimal performance. In that sense, a reliable Autonomous Driving (AD) system must produce reasonable predictions on time. However, although many of these approaches use simple ConvNets and LSTMs, the resulting models might not be efficient enough for real-time applications when using both sources of information (map and trajectory history). Moreover, the performance of these models depends heavily on the amount of training data, which can be expensive to obtain (particularly the annotated HD maps). In this work, we explore how to achieve competitive performance on the Argoverse 1.0 Benchmark using efficient attention-based models, which take as input the past trajectories and map-based features computed from minimal map information to ensure efficient and reliable MP. These features represent interpretable information, such as the drivable area and plausible goal points, in contrast to black-box CNN-based methods for map processing.