Multi-person total motion capture is extremely challenging because it must handle severe occlusions, different reconstruction granularities from body to face and hands, drastically changing observation scales, and fast body movements. To overcome these challenges, we contribute a lightweight total motion capture system for multi-person interactive scenarios using only sparse multi-view cameras. With a novel hand and face bootstrapping algorithm, our method efficiently localizes and accurately associates hands and faces even under severe occlusion. We leverage both pose regression and keypoint detection methods, and further propose a unified two-stage parametric fitting method to achieve pixel-aligned accuracy. Moreover, for extremely self-occluded poses and close interactions, we propose a novel feedback mechanism that propagates the pixel-aligned reconstructions into the next frame for more accurate association. Overall, we propose the first lightweight total capture system, which achieves fast, robust, and accurate multi-person total motion capture. Experimental results show that our method is more accurate than existing methods under sparse-view setups.