We present MeshLeTemp, a powerful method for 3D human pose and mesh reconstruction from a single image. To encode human body priors, we propose a learnable template human mesh in place of the constant template used by previous state-of-the-art methods. The learnable template captures not only vertex-vertex interactions but also human pose and body shape, allowing it to adapt to diverse images. We also introduce a strategy for enriching training data that contains both 2D and 3D annotations. We conduct extensive experiments to demonstrate the generalizability of our method and the effectiveness of our data strategy. As one of our ablation studies, we further adapt MeshLeTemp to another domain, namely 3D hand reconstruction.
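To illustrate the core idea of the abstract, the sketch below shows one plausible way a learnable template mesh could replace a constant template in a PyTorch mesh-regression pipeline. This is a minimal illustration, not the paper's implementation: the class name LearnableTemplate, the vertex count, and the feature dimensions are assumptions made for the example.

```python
import torch
import torch.nn as nn


class LearnableTemplate(nn.Module):
    """Hypothetical sketch: a template mesh stored as a trainable parameter,
    replacing the fixed (constant) template used as a vertex prior."""

    def __init__(self, num_vertices: int, init_vertices: torch.Tensor = None):
        super().__init__()
        if init_vertices is None:
            # Fall back to a small random initialization if no canonical
            # mesh is supplied (illustrative choice, not from the paper).
            init_vertices = torch.randn(num_vertices, 3) * 0.01
        # The template vertices are updated by backpropagation, so the prior
        # can reflect pose and body shape rather than staying constant.
        self.vertices = nn.Parameter(init_vertices.clone())

    def forward(self, batch_size: int) -> torch.Tensor:
        # Broadcast the single template to the batch so it can be combined
        # with per-image features before the mesh regressor.
        return self.vertices.unsqueeze(0).expand(batch_size, -1, -1)


# Usage sketch: per-vertex image features concatenated with the template
# to form input tokens for a downstream regressor. Sizes are illustrative.
template = LearnableTemplate(num_vertices=431)        # coarse-mesh size (assumed)
img_feats = torch.randn(8, 431, 2048)                 # (batch, vertices, channels)
tokens = torch.cat([img_feats, template(8)], dim=-1)  # (8, 431, 2051)
```

Because the template is a parameter rather than a buffer, its vertices receive gradients during training, which is the mechanism that lets the prior adapt to diverse images.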