Previous 3D human pose and mesh estimation methods mostly rely on only a global image feature to predict the 3D rotations of human joints (i.e., 3D rotational pose) from an input image. However, local features at the positions of human joints (i.e., positional pose) provide joint-specific information, which is essential for understanding human articulation. To utilize both local and global features effectively, we present Pose2Pose, a 3D positional pose-guided 3D rotational pose prediction network, built on a positional pose-guided pooling and a joint-specific graph convolution. The positional pose-guided pooling extracts useful joint-specific local and global features. The joint-specific graph convolution then processes these joint-specific features by learning joint-specific characteristics and the different relationships between different joints. We apply Pose2Pose to expressive 3D human pose and mesh estimation and show that it outperforms all previous part-specific and expressive methods by a large margin. The code will be publicly available.
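To make the two components concrete, below is a minimal PyTorch sketch of how a positional pose-guided pooling and a joint-specific graph convolution could look. The module names, feature dimensions, global/local fusion by concatenation, and the softmax-normalized learnable adjacency are illustrative assumptions, not the paper's actual implementation.

```python
# A hedged sketch, assuming 2D joint positions normalized to [-1, 1] and a
# CNN feature map as inputs; details differ from the paper's implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PoseGuidedPooling(nn.Module):
    """Samples a local feature at each predicted 2D joint position and
    concatenates it with a global average-pooled feature."""
    def forward(self, feat_map, joints_2d):
        # feat_map: (B, C, H, W); joints_2d: (B, J, 2) in [-1, 1]
        grid = joints_2d.unsqueeze(2)                       # (B, J, 1, 2)
        local = F.grid_sample(feat_map, grid,
                              mode='bilinear', align_corners=False)
        local = local.squeeze(-1).permute(0, 2, 1)          # (B, J, C)
        glob = feat_map.mean(dim=(2, 3))                    # (B, C)
        glob = glob.unsqueeze(1).expand(-1, local.size(1), -1)
        return torch.cat([local, glob], dim=-1)             # (B, J, 2C)

class JointGCN(nn.Module):
    """Graph convolution with a separate weight matrix per joint and a
    learnable adjacency, so each joint learns its own characteristics
    and its own relationships to the other joints."""
    def __init__(self, num_joints, in_dim, out_dim):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(num_joints, in_dim, out_dim) * 0.01)
        self.adj = nn.Parameter(torch.eye(num_joints))      # learned joint graph
    def forward(self, x):
        # x: (B, J, in_dim) -> per-joint transform, then graph mixing
        h = torch.einsum('bji,jio->bjo', x, self.weight)    # joint-specific weights
        return torch.einsum('jk,bko->bjo',
                            torch.softmax(self.adj, dim=-1), h)
```

In this sketch, the pooled (B, J, 2C) features would be fed through stacked JointGCN layers to regress the 3D rotational pose per joint; the per-joint weight tensor is what distinguishes this from a standard graph convolution, which shares one weight matrix across all nodes.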