In this paper, we address the problem of forecasting multi-pedestrian motion and their shared scene occupancy map, a task critical for safe navigation in self-driving. Our contributions are twofold. First, we advocate for predicting both the individual motions and the scene occupancy map in order to effectively deal with missing detections caused by postprocessing, e.g., confidence thresholding and non-maximum suppression. Second, we propose a Scene-Actor Graph Neural Network (SA-GNN) which preserves the relative spatial information of pedestrians via 2D convolution, and captures the interactions among pedestrians within the same scene, including those that have not been detected, via message passing. On two large-scale real-world datasets, nuScenes and ATG4D, we showcase that our scene-occupancy predictions are more accurate and better calibrated than those from state-of-the-art motion forecasting methods, while also matching their performance on pedestrian motion forecasting metrics.
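To make the two architectural ideas named above concrete, the following is a minimal sketch in PyTorch of (a) a 2D-convolutional scene encoder that keeps the relative spatial layout of pedestrians intact and (b) message passing among per-pedestrian nodes. All module names, layer sizes, and the fully connected aggregation scheme are hypothetical illustrations of the general technique; this is not the authors' SA-GNN implementation.

import torch
import torch.nn as nn

class SceneActorSketch(nn.Module):
    """Hypothetical sketch: conv scene encoder + actor message passing."""
    def __init__(self, in_ch=3, feat=32, rounds=2):
        super().__init__()
        # 2D convolutions over a bird's-eye-view raster preserve the
        # relative spatial arrangement of pedestrians in the scene.
        self.scene_enc = nn.Sequential(
            nn.Conv2d(in_ch, feat, 3, padding=1), nn.ReLU(),
            nn.Conv2d(feat, feat, 3, padding=1), nn.ReLU(),
        )
        # One shared MLP per message-passing round: each node is updated
        # from its own state and the sum of all other nodes' states.
        self.msg = nn.ModuleList(
            nn.Sequential(nn.Linear(2 * feat, feat), nn.ReLU())
            for _ in range(rounds)
        )
        # Two heads: per-actor future displacement, and a dense
        # occupancy logit map covering the whole scene.
        self.motion_head = nn.Linear(feat, 2)
        self.occ_head = nn.Conv2d(feat, 1, 1)

    def forward(self, bev, actor_xy):
        # bev: (1, in_ch, H, W) raster; actor_xy: (N, 2) pixel coordinates.
        fmap = self.scene_enc(bev)                               # (1, F, H, W)
        # Gather each detected actor's feature at its BEV location.
        nodes = fmap[0, :, actor_xy[:, 1], actor_xy[:, 0]].t()   # (N, F)
        for mlp in self.msg:
            # Fully connected message passing: aggregate all other nodes.
            agg = nodes.sum(0, keepdim=True) - nodes             # (N, F)
            nodes = mlp(torch.cat([nodes, agg], dim=-1))
        return self.motion_head(nodes), self.occ_head(fmap)

# Toy usage: a 3-channel 64x64 BEV raster with two detected pedestrians.
model = SceneActorSketch()
bev = torch.randn(1, 3, 64, 64)
xy = torch.tensor([[10, 20], [40, 40]])
motion, occupancy = model(bev, xy)
print(motion.shape, occupancy.shape)  # (2, 2) and (1, 1, 64, 64)

Note that the dense occupancy head operates on the full feature map rather than only on actor nodes, which mirrors the abstract's motivation: pedestrians missed by detection postprocessing can still contribute to the predicted occupancy.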