Global Adaptation meets Local Generalization: Unsupervised Domain Adaptation for 3D Human Pose Estimation

When a pre-trained 2D-to-3D human pose lifting model is applied to an unseen target dataset, performance commonly degrades sharply because of domain shift. We observe that the degradation is caused by two factors: 1) a large distribution gap in the global positions of poses between the source and target datasets, due to varying camera parameters and settings, and 2) insufficient diversity of local pose structures in the training data. To address both, we combine \textbf{global adaptation} and \textbf{local generalization} in \textit{PoseDA}, a simple yet effective framework for unsupervised domain adaptation in 3D human pose estimation. Specifically, global adaptation aligns the global positions of poses from the source domain to the target domain with a proposed global position alignment (GPA) module, and local generalization enhances the diversity of the 2D-3D pose mapping with a local pose augmentation (LPA) module. These modules bring significant performance improvement without introducing additional learnable parameters. LPA enhances the diversity of 3D poses through an adversarial training scheme consisting of 1) an augmentation generator that generates the parameters of pre-defined pose transformations and 2) an anchor discriminator that ensures the realism and quality of the augmented data. Our approach is applicable to almost all 2D-3D lifting models. \textit{PoseDA} achieves 61.3 mm MPJPE on MPI-INF-3DHP under a cross-dataset evaluation setup, improving upon the previous state-of-the-art method by 10.2\%.
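
For readers unfamiliar with the reported metric, MPJPE (mean per-joint position error) is the Euclidean distance between predicted and ground-truth 3D joints, averaged over joints and frames. The sketch below is a minimal NumPy illustration and is not taken from the PoseDA code; the function names `mpjpe` and `root_relative`, the choice of joint 0 as the root, and the toy coordinates are all assumptions made for the example. It also shows why a purely global offset, the kind of discrepancy that global adaptation targets, inflates the error even when the local pose structure is identical.

```python
import numpy as np


def mpjpe(pred, gt):
    """Mean per-joint position error, in the units of the inputs (e.g. mm).

    pred, gt: arrays of shape (..., num_joints, 3).
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    if pred.shape != gt.shape:
        raise ValueError("pred and gt must have the same shape")
    # Euclidean distance per joint, then the mean over joints and frames.
    return float(np.linalg.norm(pred - gt, axis=-1).mean())


def root_relative(pose, root_index=0):
    """Express every joint relative to the root joint.

    Assumption for this sketch: the root is the joint at `root_index`.
    """
    pose = np.asarray(pose, dtype=float)
    return pose - pose[..., root_index:root_index + 1, :]


# Toy data (hypothetical values): one frame with three joints, in mm.
gt = np.array([[0.0, 0.0, 0.0],
               [100.0, 0.0, 0.0],
               [100.0, 200.0, 0.0]])

# Same local structure, shifted globally by (3, 4, 0) mm.
pred = gt + np.array([3.0, 4.0, 0.0])

error_global = mpjpe(pred, gt)                               # offset counts as error
error_local = mpjpe(root_relative(pred), root_relative(gt))  # offset removed
print(error_global, error_local)
```

In this toy case every joint is displaced by the same vector, so the error comes entirely from the global position and disappears once both poses are made root-relative.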
