Let us rethink the real-world scenarios that require human motion prediction techniques, such as human-robot collaboration. Existing works simplify the task of predicting human motion into a one-off process of forecasting a short future sequence (usually no longer than 1 second) from a historical observed one. However, such a simplification may fail to meet practical needs, because it neglects the fact that motion prediction in real applications is not an isolated ``observe then predict'' unit, but a consecutive process composed of many rounds of such units, semi-overlapped along the entire sequence. As time goes on, the ground truth for the part predicted in the previous round becomes observable in the new round, but the deviation between the two is neither exploited nor capturable by the existing isolated learning paradigm. In this paper, we propose DeFeeNet, a simple yet effective network that can be added to existing one-off prediction models to realize deviation perception and feedback when applied to the consecutive motion prediction task. At each prediction round, the deviation generated by the previous unit is first encoded by our DeFeeNet and then incorporated into the existing predictor to enable deviation-aware prediction, which, for the first time, allows information to be transmitted across adjacent prediction units. We design two versions of DeFeeNet, one MLP-based and one GRU-based. On Human3.6M and the more complicated BABEL dataset, experimental results indicate that our proposed network improves consecutive human motion prediction performance regardless of the base model.
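To make the consecutive setting concrete, the following is a minimal sketch of the round structure described above, not the DeFeeNet architecture itself. It rests on several assumptions that the abstract does not state: the window shifts by exactly the prediction length, the observation is at least as long as the prediction, the base model is a toy zero-velocity predictor, and the learned deviation encoder is replaced by a hand-crafted running correction. All function and parameter names are hypothetical.

```python
import numpy as np


def zero_velocity_predictor(observed, pred_len):
    """Toy stand-in for a one-off predictor: repeat the last observed pose."""
    return np.repeat(observed[-1:], pred_len, axis=0)


def consecutive_prediction(sequence, obs_len, pred_len, use_feedback):
    """Run semi-overlapped "observe then predict" rounds along one sequence.

    sequence: array of shape (frames, dims).
    Returns the mean per-frame L2 error of each round.
    """
    if obs_len < pred_len:
        # Assumption of this sketch: the whole previous prediction must fall
        # inside the new observation window.
        raise ValueError("obs_len must be >= pred_len")

    correction = np.zeros((pred_len, sequence.shape[1]))
    prev_pred = None
    round_errors = []

    # Assumption: each round shifts the window by pred_len frames.
    for start in range(0, len(sequence) - obs_len - pred_len + 1, pred_len):
        observed = sequence[start:start + obs_len]
        target = sequence[start + obs_len:start + obs_len + pred_len]

        if use_feedback and prev_pred is not None:
            # The frames predicted in the previous round are now the last
            # pred_len observed frames, so their ground truth is available.
            deviation = observed[-pred_len:] - prev_pred
            # Hand-crafted feedback (NOT the learned MLP/GRU encoder):
            # accumulate the deviation into a running correction.
            correction = correction + deviation

        pred = zero_velocity_predictor(observed, pred_len) + correction
        error = np.linalg.norm(pred - target, axis=1).mean()
        round_errors.append(float(error))
        prev_pred = pred

    return round_errors


if __name__ == "__main__":
    # Synthetic constant-velocity "motion" with 3 coordinates.
    toy = np.arange(50, dtype=float)[:, None] * np.array([1.0, 2.0, 0.5])
    print("isolated rounds:", consecutive_prediction(toy, 10, 10, False))
    print("with feedback:  ", consecutive_prediction(toy, 10, 10, True))
```

On this synthetic constant-velocity input, the isolated rounds repeat the same error every round, while the feedback variant removes it from the second round onward. Real motion is not this regular, which is presumably why a learned deviation encoder is used rather than a fixed additive rule like the one sketched here.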