Forecasting pedestrian trajectories in shared urban traffic environments remains one of the challenging problems in the development of autonomous vehicles (AVs). In the literature, this problem is often tackled using recurrent neural networks (RNNs). Although RNNs are powerful at capturing the temporal dependencies in pedestrians' motion trajectories, they have been argued to struggle with longer sequential data. In this work, we therefore introduce a framework based on transformer networks, which have recently been shown to be more efficient than RNNs and to outperform them in many sequence-based tasks. As input to our framework, we fuse past positional information, agent interaction information, and physical scene semantics in order to provide robust pedestrian trajectory prediction. We evaluated our framework on two real-life datasets of pedestrians in shared urban traffic environments, where it outperformed the compared baseline approaches over both short-term and long-term prediction horizons.