Long-term human motion prediction is essential in safety-critical applications such as human-robot interaction and autonomous driving. In this paper, we show that predicting the human pose at every time instant is unnecessary for long-term forecasting. Instead, it is more effective to predict a few keyposes and approximate the intermediate poses by linearly interpolating between them. We demonstrate that this approach lets us predict realistic motions up to 5 seconds into the future, far longer than the typical 1 second encountered in the literature. Over this extended period, our predictions are more realistic and better preserve the motion dynamics than those of state-of-the-art methods. Furthermore, because we model future keyposes probabilistically, we can generate multiple plausible future motions by sampling at inference time. This matters because, given what a person has already done, there are usually several plausible things they could do next.
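The interpolation step can be illustrated with a minimal sketch. It assumes that each pose is an array of 3D joint positions and that each keypose comes with a timestamp; the function name `interpolate_keyposes`, the toy two-joint skeleton, and the 25 fps frame rate are hypothetical choices made for illustration, not details taken from the paper, which may use a different pose representation.

```python
import numpy as np


def interpolate_keyposes(keyposes, key_times, query_times):
    """Linearly interpolate between keyposes (hypothetical helper).

    keyposes:    array of shape (K, J, 3), K keyposes with J joints each
    key_times:   array of shape (K,), strictly increasing timestamps in seconds
    query_times: array of shape (T,), times at which a pose is wanted
    Returns an array of shape (T, J, 3).
    """
    keyposes = np.asarray(keyposes, dtype=float)
    key_times = np.asarray(key_times, dtype=float)
    query_times = np.asarray(query_times, dtype=float)

    # Flatten each pose to a vector so every coordinate is interpolated independently.
    flat = keyposes.reshape(len(key_times), -1)
    columns = [
        np.interp(query_times, key_times, flat[:, d]) for d in range(flat.shape[1])
    ]
    return np.stack(columns, axis=1).reshape(len(query_times), *keyposes.shape[1:])


# Toy example (assumed values): three keyposes of a two-joint skeleton
# placed at 0 s, 2 s, and 5 s.
keyposes = np.array([
    [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
    [[1.0, 0.0, 0.0], [1.0, 1.5, 0.0]],
    [[1.0, 0.0, 2.0], [1.0, 1.0, 2.0]],
])
key_times = np.array([0.0, 2.0, 5.0])

# Dense timeline covering 5 seconds at an assumed 25 frames per second.
query_times = np.linspace(0.0, 5.0, 126)
motion = interpolate_keyposes(keyposes, key_times, query_times)
```

Here three keyposes are enough to produce 126 frames. In the probabilistic setting described above, each set of sampled keyposes would be passed through the same interpolation to obtain a different plausible future motion.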