Although the performance of 3D human pose and shape estimation methods has improved significantly in recent years, existing approaches typically generate 3D poses defined in a camera-centered or human-centered coordinate system. This makes it difficult to estimate a person's pure pose and motion in the world coordinate system from a video captured with a moving camera. To address this issue, this paper presents a camera-motion-agnostic approach for predicting 3D human pose and mesh defined in the world coordinate system. The core idea of the proposed approach is to estimate the difference between two adjacent global poses (i.e., the global motion), which is invariant to the choice of coordinate system, instead of the global pose itself, which is coupled to the camera motion. To this end, we propose the global motion regressor (GMR), a network based on bidirectional gated recurrent units (GRUs) that predicts the global motion sequence from the local pose sequence, which consists of the relative rotations of joints. We evaluate on 3DPW and synthetic datasets, which are constructed in a moving-camera environment. Extensive experiments demonstrate the effectiveness of the proposed method empirically. Code and datasets are available at https://github.com/seonghyunkim1212/GMR
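The invariance claim at the heart of the approach can be illustrated with a small numerical sketch. It is not the paper's implementation: it assumes the difference between adjacent global orientations is taken as the relative rotation R_t^T R_{t+1}, it ignores translation, and the helper names (`rotation_matrix`, `global_motion`, `accumulate`) and all sequences are hypothetical. Under those assumptions, the motion computed in two different fixed world frames is identical, while the motion computed from camera-frame orientations changes when the camera itself rotates.

```python
import numpy as np

def rotation_matrix(axis, angle):
    """Rodrigues' formula: rotation by `angle` radians about `axis`."""
    axis = np.asarray(axis, dtype=float)
    x, y, z = axis / np.linalg.norm(axis)
    K = np.array([[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]])
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)

def global_motion(orientations):
    """Relative rotation between adjacent frames, R_t^T @ R_{t+1} (assumed definition)."""
    pairs = zip(orientations[:-1], orientations[1:])
    return np.stack([prev.T @ nxt for prev, nxt in pairs])

def accumulate(first, motions):
    """Rebuild a global orientation sequence from a start orientation and motions."""
    out = [first]
    for delta in motions:
        out.append(out[-1] @ delta)
    return np.stack(out)

# Hypothetical root orientations of a person, expressed in one world frame.
world = np.stack([
    rotation_matrix([0, 1, 0], 0.10 * t) @ rotation_matrix([1, 0, 0], 0.05 * t)
    for t in range(5)
])

# The same sequence expressed in a different, fixed world frame.
Q = rotation_matrix([1, 2, 3], 0.7)
world_alt = Q @ world

# The same sequence seen from a rotating camera: R_cam_t = C_t^T @ R_world_t.
cams = np.stack([rotation_matrix([0, 0, 1], 0.2 * t) for t in range(5)])
camera = np.transpose(cams, (0, 2, 1)) @ world

motion_world = global_motion(world)
motion_alt = global_motion(world_alt)
motion_cam = global_motion(camera)

# Same motion regardless of which fixed world frame is chosen.
invariant = np.allclose(motion_world, motion_alt)
# Camera-frame differences are contaminated by the camera's own rotation.
coupled_to_camera = not np.allclose(motion_world, motion_cam)
# Accumulating the motion from any chosen start orientation recovers the sequence.
rebuilt = accumulate(world_alt[0], motion_world)
```

Because the motion does not depend on the frame, a sequence of predicted motions can be accumulated from an arbitrary starting orientation to obtain global poses; in this sketch `rebuilt` matches `world_alt`.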