Recent monocular human performance capture approaches have shown compelling dense tracking results of the full body from a single RGB camera. However, existing methods either do not estimate clothing at all or model cloth deformation with simple geometric priors rather than the underlying physical principles. This leads to noticeable artifacts in their reconstructions, e.g., baked-in wrinkles, implausible deformations that seemingly defy gravity, and intersections between cloth and body. To address these problems, we propose a person-specific, learning-based method that integrates a simulation layer into the training process to provide, for the first time, physics supervision in the context of weakly supervised deep monocular human performance capture. We show how integrating physics into the training process improves the learned cloth deformations, allows modeling clothing as a separate piece of geometry, and substantially reduces cloth-body intersections. Relying only on weak 2D multi-view supervision during training, our approach significantly improves over current state-of-the-art methods and is thus a clear step towards realistic monocular capture of the entire deforming surface of a clothed human.
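Although the abstract stays high-level, the core idea, combining a differentiable physics term with weak 2D multi-view losses during training, can be sketched in code. The snippet below is a minimal illustration, not the paper's implementation: it assumes a mass-spring cloth model, uses a static force-balance residual as the physics term, and assumes 3x4 perspective projection matrices for the multi-view reprojection loss; all function names, shapes, and loss weights are hypothetical.

```python
# Minimal sketch (illustrative assumptions throughout, NOT the paper's method):
# a differentiable physics residual supervises predicted cloth vertices
# alongside weak 2D multi-view reprojection losses.
import torch

def physics_residual_loss(verts, edges, rest_len, k=10.0, mass=1.0, g=9.81):
    """Penalize the net-force residual of a static mass-spring cloth:
    at equilibrium, spring forces balance gravity at every vertex."""
    # verts: (V, 3) predicted cloth vertices; edges: (E, 2) index pairs.
    d = verts[edges[:, 0]] - verts[edges[:, 1]]                # (E, 3) edge vectors
    length = d.norm(dim=-1, keepdim=True).clamp(min=1e-8)      # (E, 1)
    f = -k * (length - rest_len.unsqueeze(-1)) * (d / length)  # Hookean spring force
    force = torch.zeros_like(verts)
    force = force.index_add(0, edges[:, 0], f)                 # accumulate per vertex
    force = force.index_add(0, edges[:, 1], -f)                # opposite reaction
    gravity = torch.zeros_like(verts)
    gravity[:, 2] = -mass * g                                  # z-up convention (assumed)
    return (force + gravity).pow(2).sum(dim=-1).mean()

def reprojection_loss(verts, cams, points_2d):
    """Weak multi-view 2D supervision: project vertices with each camera's
    3x4 projection matrix and compare to detected 2D points."""
    loss = 0.0
    for P, pts in zip(cams, points_2d):
        homo = torch.cat([verts, torch.ones(len(verts), 1)], dim=-1)  # (V, 4)
        proj = homo @ P.T                                             # (V, 3)
        uv = proj[:, :2] / proj[:, 2:3].clamp(min=1e-6)               # perspective divide
        loss = loss + (uv - pts).pow(2).mean()
    return loss / len(cams)

# Toy usage with random data (purely illustrative).
V, E, C = 100, 200, 4
verts = torch.randn(V, 3, requires_grad=True)   # stand-in for a network's output
edges = torch.randint(0, V, (E, 2))
rest_len = torch.rand(E) * 0.1
cams = [torch.randn(3, 4) for _ in range(C)]
points_2d = [torch.randn(V, 2) for _ in range(C)]

total = reprojection_loss(verts, cams, points_2d) \
      + 1e-3 * physics_residual_loss(verts, edges, rest_len)
total.backward()  # physics gradients flow back into the vertex predictor
```

Because the physics term is differentiable, it acts as a soft regularizer during training rather than a hard constraint: gradients from the force residual discourage gravity-defying or over-stretched configurations even where the weak 2D supervision is ambiguous.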