Let us rethink the real-world scenarios that require human motion prediction techniques, such as human-robot collaboration. Current works simplify the task of predicting human motions into a one-off process of forecasting a short future sequence (usually no longer than 1 second) based on a historical observed one. However, such simplification may fail to meet practical needs due to the neglect of the fact that motion prediction in real applications is not an isolated ``observe then predict'' unit, but a consecutive process composed of many rounds of such unit, semi-overlapped along the entire sequence. As time goes on, the predicted part of previous round has its corresponding ground truth observable in the new round, but their deviation in-between is neither exploited nor able to be captured by existing isolated learning fashion. In this paper, we propose DeFeeNet, a simple yet effective network that can be added on existing one-off prediction models to realize deviation perception and feedback when applied to consecutive motion prediction task. At each prediction round, the deviation generated by previous unit is first encoded by our DeFeeNet, and then incorporated into the existing predictor to enable a deviation-aware prediction manner, which, for the first time, allows for information transmit across adjacent prediction units. We design two versions of DeFeeNet as MLP-based and GRU-based, respectively. On Human3.6M and more complicated BABEL, experimental results indicate that our proposed network improves consecutive human motion prediction performance regardless of the basic model.
翻译:让我们重新思考需要人类运动预测技术的实际场景,例如人机协作。当前的工作将预测人类运动简化为预测历史观察序列的短期未来序列(通常不超过1秒)的过程。然而,这种简化可能无法满足实际需求,因为它忽略了实际应用中运动预测不是一个孤立的“观察然后预测”的单位,而是由许多轮这样的单位组成,沿整个序列半重叠。随着时间的推移,先前回合的预测部分在新回合中具有其对应的地面真实观测值,但它们之间的偏差既没有被利用也不能被现有的孤立学习方式捕捉到。在本文中,我们提出了DeFeeNet,这是一种简单而有效的网络,可以添加到现有的一次性预测模型中,在应用于连续运动预测任务时实现偏差感知和反馈。在每个预测回合中,先前单元生成的偏差首先由我们的DeFeeNet编码,然后纳入现有的预测器,以实现一种偏差感知预测方式,这使得信息可以在相邻的预测单元之间传输,这是首次发生的。我们设计了基于MLP和GRU的两个版本的DeFeeNet。在Human3.6M和更复杂的BABEL上,实验结果表明,我们提出的网络改善了连续人体运动预测性能,无论基本模型如何。