Human trajectory forecasting in crowds presents two challenges: modelling social interactions and outputting a collision-free multimodal distribution. Following the success of Social Generative Adversarial Networks (SGAN), recent works propose various GAN-based designs to better model human motion in crowds. Despite superior performance in reducing distance-based metrics, current networks fail to output socially acceptable trajectories, as evidenced by the high collision rates in model predictions. To counter this, we introduce SGANv2: an improved safety-compliant SGAN architecture equipped with spatio-temporal interaction modelling and a transformer-based discriminator. The spatio-temporal modelling ability helps the model learn human social interactions better, while the transformer-based discriminator design improves temporal sequence modelling. Additionally, SGANv2 utilizes the learned discriminator even at test time via a collaborative sampling strategy that not only refines colliding trajectories but also prevents mode collapse, a common phenomenon in GAN training. Through extensive experimentation on multiple real-world and synthetic datasets, we demonstrate the efficacy of SGANv2 in providing socially compliant multimodal trajectories.
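The abstract does not define how collisions in model predictions are counted. As a rough illustration only, the sketch below assumes a scene-level collision rate: a scene counts as colliding if the predicted primary trajectory comes within a fixed distance of any neighbour at the same time step. The function names, the scene layout, and the 0.1 threshold are hypothetical and are not taken from the SGANv2 work.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Trajectory = Sequence[Point]  # one (x, y) position per time step


def collides(primary: Trajectory, neighbour: Trajectory, radius: float = 0.1) -> bool:
    """Return True if the two agents are closer than `radius` at any shared time step."""
    # zip stops at the shorter trajectory, so only shared time steps are compared.
    return any(math.dist(p, q) < radius for p, q in zip(primary, neighbour))


def collision_rate(
    scenes: List[Tuple[Trajectory, List[Trajectory]]],
    radius: float = 0.1,  # assumed threshold, in the same units as the positions
) -> float:
    """Fraction of scenes whose primary prediction collides with at least one neighbour."""
    if not scenes:
        return 0.0
    hits = sum(
        any(collides(primary, n, radius) for n in neighbours)
        for primary, neighbours in scenes
    )
    return hits / len(scenes)


# Two toy scenes: the first has a near-contact at the second time step, the second does not.
scene_a = ([(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)], [[(0.0, 1.0), (1.0, 0.05), (2.0, 1.0)]])
scene_b = ([(0.0, 0.0), (1.0, 0.0)], [[(0.0, 2.0), (1.0, 2.0)]])
rate = collision_rate([scene_a, scene_b])
```

A metric of this kind complements distance-based errors: a prediction can lie close to the ground truth on average and still pass through a neighbour, and only the collision check would flag that.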