There has been a recent surge of interest in applying transformers to 3D human pose estimation (HPE) because of their strength in modeling long-term dependencies. However, existing transformer-based methods treat all body joints as equally important inputs and ignore prior knowledge of the human skeleton topology in the self-attention mechanism. To address this issue, we propose a Pose-Oriented Transformer (POT) with uncertainty-guided refinement for 3D HPE. Specifically, we first develop a novel pose-oriented self-attention mechanism and a distance-related position embedding for POT to explicitly exploit the human skeleton topology. The pose-oriented self-attention mechanism models the topological interactions between body joints, whereas the distance-related position embedding encodes the distance of each joint to the root joint, distinguishing groups of joints that differ in regression difficulty. Furthermore, we present an Uncertainty-Guided Refinement Network (UGRN) that refines the pose predictions from POT, especially for difficult joints, by taking the estimated uncertainty of each joint into account through an uncertainty-guided sampling strategy and a self-attention mechanism. Extensive experiments demonstrate that our method significantly outperforms state-of-the-art methods with fewer model parameters on 3D HPE benchmarks such as Human3.6M and MPI-INF-3DHP.
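To make the idea of a distance-related position embedding concrete, the following minimal sketch computes, for each joint, the number of skeleton edges separating it from the root joint, and groups joints by that value. It is an illustration under stated assumptions, not the paper's implementation: the 17-joint `PARENTS` layout is a commonly used Human3.6M-style skeleton chosen here for illustration, "distance" is assumed to mean hop count along the skeleton, and the function names `hop_distances_to_root` and `group_by_distance` are hypothetical.

```python
from collections import deque

# Assumed 17-joint skeleton (a common Human3.6M-style layout, illustrative only).
# PARENTS[j] is the parent of joint j; -1 marks the root joint (pelvis).
PARENTS = [-1, 0, 1, 2, 0, 4, 5, 0, 7, 8, 9, 8, 11, 12, 8, 14, 15]


def hop_distances_to_root(parents):
    """Return, for each joint, the number of skeleton edges to the root."""
    n = len(parents)
    children = [[] for _ in range(n)]
    root = None
    for joint, parent in enumerate(parents):
        if parent < 0:
            root = joint
        else:
            children[parent].append(joint)
    if root is None:
        raise ValueError("skeleton has no root joint")

    # Breadth-first traversal from the root assigns each joint its depth.
    dist = [-1] * n
    dist[root] = 0
    queue = deque([root])
    while queue:
        joint = queue.popleft()
        for child in children[joint]:
            dist[child] = dist[joint] + 1
            queue.append(child)
    return dist


def group_by_distance(dist):
    """Group joint indices by their hop distance to the root."""
    groups = {}
    for joint, d in enumerate(dist):
        groups.setdefault(d, []).append(joint)
    return groups


if __name__ == "__main__":
    distances = hop_distances_to_root(PARENTS)
    for d, joints in sorted(group_by_distance(distances).items()):
        print(f"distance {d}: joints {joints}")
```

In a transformer, each distance value could index a learnable embedding table so that joints at the same distance from the root share a position embedding. Under the assumed layout, the joints farthest from the root are the wrists, which matches the intuition that distal joints tend to be harder to regress.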