We present a method to capture temporally coherent dynamic clothing deformation from a monocular RGB video. In contrast to the existing literature, our method does not require a pre-scanned personalized mesh template and can therefore be applied to in-the-wild videos. To constrain the output to a valid deformation space, we build statistical deformation models for three types of clothing: T-shirts, short pants, and long pants. A differentiable renderer aligns the captured shapes to the input frames by minimizing the differences in silhouette, segmentation, and texture. We develop a UV texture growing method that sequentially expands the visible texture region of the clothing in order to minimize drift in deformation tracking. We also extract fine-grained wrinkle detail from the input video by fitting the clothed surface to normal maps estimated by a convolutional neural network. Our method produces temporally coherent reconstructions of body and clothing from monocular video, and we demonstrate successful clothing capture results on a variety of challenging videos. Extensive quantitative experiments demonstrate the effectiveness of our method on metrics including body pose error and surface reconstruction error of the clothing.
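For concreteness, the image-space alignment described above can be summarized as a single weighted objective. The notation below (weights λ, body pose θ, body shape β, and clothing deformation coefficients z) is an illustrative sketch rather than the paper's own formulation:

```latex
\begin{aligned}
E(\theta, \beta, z) ={}& \lambda_{\text{sil}} E_{\text{sil}}(\theta, \beta, z)
  + \lambda_{\text{seg}} E_{\text{seg}}(\theta, \beta, z)
  + \lambda_{\text{tex}} E_{\text{tex}}(\theta, \beta, z) \\
  &+ \lambda_{\text{nrm}} E_{\text{nrm}}(\theta, \beta, z)
  + \lambda_{\text{reg}} \lVert z \rVert^{2},
\end{aligned}
```

where E_sil, E_seg, and E_tex compare the differentiably rendered model against the input silhouette, clothing segmentation, and texture, E_nrm penalizes deviation from the CNN-estimated normal maps used for wrinkle detail, and the final term keeps the deformation coefficients z within the learned statistical deformation space.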
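The UV texture growing step can likewise be illustrated with a short sketch. Assuming (hypothetically) that each frame yields a partial UV-space color map and a visibility mask from the current fit, newly seen texels are committed only where the atlas is still empty, so regions captured earlier are never overwritten and remain a stable photometric anchor against drift:

```python
import numpy as np

def grow_uv_texture(atlas, filled, frame_texels, frame_visible):
    """Sequentially expand the visible UV texture region of the clothing.

    atlas         : (H, W, 3) float array, the accumulated clothing texture
    filled        : (H, W) bool array, texels captured in earlier frames
    frame_texels  : (H, W, 3) float array, colors sampled from the current frame
    frame_visible : (H, W) bool array, texels visible under the current fit

    All names and per-frame inputs are hypothetical placeholders; the sketch
    only illustrates the "grow, never overwrite" rule described in the text.
    """
    new = frame_visible & ~filled      # texels seen for the first time
    atlas[new] = frame_texels[new]     # commit them to the atlas
    filled |= new                      # mark them as captured
    return atlas, filled

# Usage: start from an empty atlas and fold in frames one by one.
H, W = 512, 512
atlas = np.zeros((H, W, 3), dtype=np.float32)
filled = np.zeros((H, W), dtype=bool)
for frame_texels, frame_visible in []:  # placeholder for a real frame stream
    atlas, filled = grow_uv_texture(atlas, filled, frame_texels, frame_visible)
```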