Compared to joint position, the accuracy of joint rotation and shape estimation has received relatively little attention in skinned multi-person linear model (SMPL)-based human mesh reconstruction from multi-view images. Work in this field is broadly classified into two categories. The first approach performs joint estimation and then produces SMPL parameters by fitting the SMPL model to the resulting joints. The second approach regresses SMPL parameters directly from the input images through a convolutional neural network (CNN)-based model. However, these approaches suffer from a lack of information for resolving the ambiguity of joint rotation and shape reconstruction, and from the difficulty of network learning. To solve these problems, we propose a two-stage method. The proposed method first estimates the coordinates of mesh vertices through a CNN-based model from input images, and then acquires SMPL parameters by fitting the SMPL model to the estimated vertices. Estimated mesh vertices provide sufficient information for determining joint rotation and shape, and are easier to learn than SMPL parameters. In experiments on the Human3.6M and MPI-INF-3DHP datasets, the proposed method significantly outperforms previous methods in terms of joint rotation and shape estimation, and achieves competitive performance in terms of joint location estimation.
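To make the second stage concrete, the sketch below fits the parameters of a toy linear shape model to vertices assumed to come from the first-stage CNN. It is only an illustrative stand-in: real SMPL additionally includes pose blend shapes and linear blend skinning, and the actual fitting is iterative rather than a closed-form solve; all names (`template`, `shape_dirs`, `mesh`) are hypothetical.

```python
import numpy as np

# Toy stand-in for the SMPL shape space: vertices are a template mesh
# plus a linear combination of shape blend directions. (Real SMPL adds
# pose-dependent deformation and linear blend skinning on top of this.)
rng = np.random.default_rng(0)
n_verts, n_shape = 100, 10
template = rng.normal(size=(n_verts, 3))
shape_dirs = rng.normal(size=(n_verts * 3, n_shape))

def mesh(beta):
    """Vertices produced by shape parameters beta (toy linear model)."""
    return template + (shape_dirs @ beta).reshape(n_verts, 3)

# Stage 1 output (assumed): noisy vertex coordinates from a CNN.
beta_true = rng.normal(size=n_shape)
v_est = mesh(beta_true) + 0.01 * rng.normal(size=(n_verts, 3))

# Stage 2: fit the parametric model to the estimated vertices by
# minimizing per-vertex L2 distance. For this linear toy model the
# optimum has a closed form via least squares.
residual = (v_est - template).reshape(-1)
beta_fit, *_ = np.linalg.lstsq(shape_dirs, residual, rcond=None)

print(np.abs(beta_fit - beta_true).max())  # recovered parameters are close
```

Because the estimated vertices over-determine the shape parameters (300 coordinate equations for 10 unknowns here), the fit is well-posed, which reflects the paper's point that vertices carry enough information to resolve rotation and shape ambiguity.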