Multi-pedestrian trajectory prediction is an indispensable element of autonomous systems that must interact safely with crowds in unstructured environments. Many recent trajectory prediction algorithms focus on modeling the social norms behind pedestrian motion. However, these works typically rely on two assumptions that hinder their use in robot applications: (1) the positions of all pedestrians are consistently tracked, and (2) the target agent pays attention to all pedestrians in the scene. The first assumption leads to biased interaction modeling when pedestrian data are incomplete. The second assumption aggregates redundant surrounding information, so the target agent may be influenced by unimportant neighbors or exhibit overly conservative motion. We therefore propose the Gumbel Social Transformer, in which an Edge Gumbel Selector samples a sparse interaction graph of partially detected pedestrians at each time step. A Node Transformer Encoder and a Masked LSTM then encode pedestrian features with the sampled sparse graphs to predict trajectories. We demonstrate that our model overcomes the potential problems caused by these two assumptions and that our approach outperforms related works on trajectory prediction benchmarks. Code is available at https://github.com/tedhuang96/gst.
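To make the idea of sampling a sparse interaction graph over partially detected pedestrians more concrete, the following is a minimal NumPy sketch of Gumbel-softmax edge sampling with a detection mask. It is an illustration under stated assumptions, not the authors' implementation: the function name `sample_sparse_edges`, the `logits` score matrix, the temperature `tau`, and the choice of one sampled neighbor per pedestrian are all hypothetical.

```python
import numpy as np


def sample_sparse_edges(logits, detected, tau=0.5, rng=None):
    """Sample at most one neighbor per pedestrian with the Gumbel-softmax trick.

    logits:   (N, N) float array of hypothetical edge scores; logits[i, j]
              scores how much pedestrian i should attend to pedestrian j.
    detected: (N,) boolean array, True where the pedestrian is observed
              at the current time step.
    Returns (soft, hard): relaxed edge weights and a one-hot adjacency matrix.
    """
    rng = np.random.default_rng() if rng is None else rng
    n = logits.shape[0]

    # An edge i -> j is valid only if both ends are detected and i != j.
    valid = detected[:, None] & detected[None, :] & ~np.eye(n, dtype=bool)
    has_neighbor = valid.any(axis=1)

    # Perturb the scores with Gumbel(0, 1) noise and mask invalid edges.
    gumbel = rng.gumbel(size=logits.shape)
    noisy = np.where(valid, (logits + gumbel) / tau, -np.inf)

    # Masked, numerically stable softmax; rows without neighbors stay zero.
    safe = np.where(has_neighbor[:, None], noisy, 0.0)
    shifted = safe - safe.max(axis=1, keepdims=True)
    soft = np.exp(shifted) * valid
    denom = soft.sum(axis=1, keepdims=True)
    soft = np.divide(soft, denom, out=np.zeros_like(soft), where=denom > 0)

    # Hard sample: one-hot at the arg-max of the perturbed scores.
    hard = np.zeros_like(soft)
    rows = np.flatnonzero(has_neighbor)
    hard[rows, noisy[rows].argmax(axis=1)] = 1.0
    return soft, hard
```

In this sketch an undetected pedestrian neither sends nor receives edges, so missing tracks do not enter the sampled graph. In a trainable model, the hard sample would usually be combined with the soft weights through a straight-through estimator so that gradients can reach the edge scores.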