Long-term human motion prediction is essential in safety-critical applications such as human-robot interaction and autonomous driving. In this paper, we show that long-term forecasting does not require predicting the human pose at every time instant. Instead, it is more effective to predict a few keyposes and approximate the intermediate poses by linearly interpolating between them. We demonstrate that our approach enables us to predict realistic motions up to 5 seconds into the future, far longer than the typical 1 second encountered in the literature. Furthermore, because we model future keyposes probabilistically, we can generate multiple plausible future motions by sampling at inference time. Over this extended time period, our predictions are more realistic, more diverse, and better preserve the motion dynamics than those of state-of-the-art methods.
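To make the interpolation step concrete, the following is a minimal sketch of how a dense motion could be recovered from a handful of keyposes. It is an illustration under stated assumptions, not our implementation: the function name `interpolate_keyposes`, the representation of a pose as a flat vector of coordinates, and the explicit keypose time stamps are all hypothetical choices made for this example.

```python
import numpy as np


def interpolate_keyposes(key_times, keyposes, query_times):
    """Linearly interpolate poses at query_times from a few keyposes.

    Assumptions (hypothetical, for illustration only):
      - key_times: strictly increasing time stamps in seconds, shape (K,)
      - keyposes:  one pose per time stamp, shape (K, ...), e.g. (K, D)
      - query_times: times at which dense poses are wanted, shape (T,)
    Queries outside [key_times[0], key_times[-1]] are clamped to the
    first or last keypose, which is the behavior of np.interp.
    """
    key_times = np.asarray(key_times, dtype=float)
    keyposes = np.asarray(keyposes, dtype=float)
    query_times = np.asarray(query_times, dtype=float)

    if keyposes.shape[0] != key_times.shape[0]:
        raise ValueError("need exactly one keypose per key time")
    if np.any(np.diff(key_times) <= 0):
        raise ValueError("key_times must be strictly increasing")

    # Flatten each pose to a vector so every coordinate can be
    # interpolated independently with the 1-D routine np.interp.
    flat = keyposes.reshape(len(key_times), -1)
    dense = np.stack(
        [np.interp(query_times, key_times, flat[:, d])
         for d in range(flat.shape[1])],
        axis=1,
    )
    # Restore the original per-pose shape.
    return dense.reshape((len(query_times),) + keyposes.shape[1:])


# Toy usage: three 2-D "poses" placed at 0 s, 1 s and 3 s.
toy_times = [0.0, 1.0, 3.0]
toy_keyposes = [[0.0, 0.0], [2.0, 4.0], [6.0, 0.0]]
toy_motion = interpolate_keyposes(toy_times, toy_keyposes,
                                  [0.0, 0.5, 1.0, 2.0, 3.0])
print(toy_motion)
```

Under this sketch, sampling several sets of keyposes from a probabilistic model and interpolating each set would yield several distinct dense motions, which mirrors the sampling at inference time described above.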