Locally Aware Piecewise Transformation Fields for 3D Human Mesh Registration

Registering point clouds of dressed humans to parametric human models is a challenging task in computer vision. Traditional approaches often rely on heavily engineered pipelines that require accurate manual initialization of human poses and tedious post-processing. More recently, learning-based methods have been proposed in the hope of automating this process. We observe that pose initialization is key to accurate registration, but existing methods often fail to provide it. One major obstacle is that regressing joint rotations from point clouds or images of humans is still very challenging. To this end, we propose novel piecewise transformation fields (PTF), a set of functions that learn 3D translation vectors to map any query point in posed space to its corresponding position in rest-pose space. We combine PTF with multi-class occupancy networks, obtaining a novel learning-based framework that learns to simultaneously predict shape and per-point correspondences between the posed space and the canonical space for clothed humans. Our key insight is that the translation vector for each query point can be effectively estimated using point-aligned local features; consequently, rigid per-bone transformations and joint rotations can be obtained efficiently via least-squares fitting given the estimated point correspondences, circumventing the challenging task of directly regressing joint rotations with neural networks. Furthermore, the proposed PTF facilitate canonicalized occupancy estimation, which greatly improves generalization capability and results in more accurate surface reconstruction with only half the parameters of the state of the art. Both qualitative and quantitative studies show that fitting parametric models with poses initialized by our network results in much better registration quality, especially for extreme poses.
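
To make the least-squares step concrete, the following is a minimal sketch of how rigid per-bone transformations could be recovered from estimated point correspondences. It is not the authors' implementation: it assumes the standard SVD-based (Kabsch) solution to the rigid alignment problem, and the function names `fit_rigid_transform` and `fit_per_bone`, the `part_labels` input, and the three-point minimum are hypothetical choices made for illustration. It also assumes that the rest-pose position of a query point is the posed point plus its predicted translation vector.

```python
import numpy as np


def fit_rigid_transform(posed_pts, rest_pts):
    """Least-squares rigid fit: find R, t so that R @ rest + t ~= posed.

    Uses the SVD-based Kabsch solution (an assumption; the source only
    says "least-squares fitting").
    """
    posed_pts = np.asarray(posed_pts, dtype=float)
    rest_pts = np.asarray(rest_pts, dtype=float)
    mu_p = posed_pts.mean(axis=0)
    mu_r = rest_pts.mean(axis=0)
    # 3x3 cross-covariance between centered rest and posed points.
    H = (rest_pts - mu_r).T @ (posed_pts - mu_p)
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that R is a proper rotation (det = +1).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = mu_p - R @ mu_r
    return R, t


def fit_per_bone(posed_pts, translations, part_labels):
    """Fit one rigid transform per body part (hypothetical helper).

    posed_pts:    (N, 3) query points in posed space.
    translations: (N, 3) predicted translation vectors, assumed to satisfy
                  rest = posed + translation.
    part_labels:  (N,) integer part index per point (assumed to be given,
                  e.g. by a multi-class occupancy prediction).
    """
    posed_pts = np.asarray(posed_pts, dtype=float)
    rest_pts = posed_pts + np.asarray(translations, dtype=float)
    part_labels = np.asarray(part_labels)
    transforms = {}
    for label in np.unique(part_labels):
        mask = part_labels == label
        if mask.sum() < 3:  # too few points for a stable fit
            continue
        transforms[int(label)] = fit_rigid_transform(posed_pts[mask], rest_pts[mask])
    return transforms
```

Under these assumptions, each entry of the returned dictionary holds the rotation and translation that carry a part from rest-pose space to posed space, so joint rotations come from a closed-form fit rather than from direct network regression.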
