Previous work on predicting or generating 3D human pose sequences regresses either joint rotations or joint positions. The former strategy is prone to error accumulation along the kinematic chain, as well as to discontinuities when Euler angles or exponential maps are used as parameterizations. The latter requires re-projection onto skeleton constraints to avoid bone stretching and invalid configurations. This work addresses both limitations: QuaterNet represents rotations with quaternions, and our loss function performs forward kinematics on a skeleton to penalize absolute position errors instead of angle errors. We investigate both recurrent and convolutional architectures and evaluate on short-term prediction and long-term generation. For the latter, our approach is qualitatively judged to be as realistic as recent neural strategies from the graphics literature. Our experiments compare quaternions to Euler angles as well as exponential maps and show that only a very short context is required to make reliable future predictions. Finally, we show that the standard evaluation protocol for Human3.6M produces high-variance results, and we propose a simple solution.
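The idea of penalizing positions rather than angles can be illustrated with a minimal NumPy sketch. It is not the QuaterNet implementation: the function names, the (w, x, y, z) quaternion order, the parent-before-child joint ordering, and the choice of mean Euclidean distance as the error are all assumptions made for illustration.

```python
import numpy as np

def qmul(q, r):
    """Hamilton product of two quaternions in (w, x, y, z) order (assumed convention)."""
    w1, x1, y1, z1 = q
    w2, x2, y2, z2 = r
    return np.array([
        w1 * w2 - x1 * x2 - y1 * y2 - z1 * z2,
        w1 * x2 + x1 * w2 + y1 * z2 - z1 * y2,
        w1 * y2 - x1 * z2 + y1 * w2 + z1 * x2,
        w1 * z2 + x1 * y2 - y1 * x2 + z1 * w2,
    ])

def qrot(q, v):
    """Rotate the 3-vector v by the unit quaternion q."""
    qv = q[1:]
    t = 2.0 * np.cross(qv, v)
    return v + q[0] * t + np.cross(qv, t)

def forward_kinematics(rotations, offsets, parents):
    """Turn local joint rotations into world-space joint positions.

    rotations: (J, 4) local quaternions, normalized here before use.
    offsets:   (J, 3) bone offsets from each joint's parent (fixed skeleton).
    parents:   parent index per joint, -1 for the root; parents are assumed
               to appear before their children.
    """
    n = len(parents)
    positions = np.zeros((n, 3))
    world_rot = np.zeros((n, 4))
    for j in range(n):
        q = rotations[j] / np.linalg.norm(rotations[j])
        p = parents[j]
        if p < 0:
            world_rot[j] = q
            positions[j] = offsets[j]
        else:
            # A joint's position depends on its parent's accumulated rotation.
            positions[j] = positions[p] + qrot(world_rot[p], offsets[j])
            world_rot[j] = qmul(world_rot[p], q)
    return positions

def position_loss(pred_rot, target_rot, offsets, parents):
    """Mean Euclidean distance between joint positions after forward kinematics."""
    pred = forward_kinematics(pred_rot, offsets, parents)
    target = forward_kinematics(target_rot, offsets, parents)
    return np.mean(np.linalg.norm(pred - target, axis=1))

# Hypothetical 3-joint chain with unit-length bones along the x axis.
parents = [-1, 0, 1]
offsets = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
identity = np.tile(np.array([1.0, 0.0, 0.0, 0.0]), (3, 1))

# Rotate only the root by 90 degrees about z; every descendant moves.
rotated = identity.copy()
rotated[0] = [np.cos(np.pi / 4), 0.0, 0.0, np.sin(np.pi / 4)]
loss = position_loss(rotated, identity, offsets, parents)
```

In this toy chain a single rotation error at the root displaces the end joint twice as far as the middle joint, which is the kind of effect along the kinematic chain that a position-based loss weights and a per-joint angle loss does not. Because bone offsets are fixed, the resulting poses cannot stretch bones.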