Recent 2D-to-3D human pose estimation works tend to utilize the graph structure formed by the topology of the human skeleton. However, we argue that this skeletal topology is too sparse to reflect the body structure and suffers from serious 2D-to-3D ambiguity. To overcome these weaknesses, we propose a novel graph convolution network architecture, Hierarchical Graph Networks (HGN). It is based on a denser graph topology generated by our multi-scale graph structure building strategy, and thus provides finer-grained geometric information. The proposed architecture contains three sparse-to-fine representation subnetworks organized in parallel, in which multi-scale graph-structured features are processed and exchange information through a novel feature fusion strategy, leading to rich hierarchical representations. We also introduce a 3D coarse mesh constraint to further boost detail-related feature learning. Extensive experiments demonstrate that our HGN achieves state-of-the-art performance with fewer network parameters. Code is released at https://github.com/qingshi9974/BMVC2021-Hierarchical-Graph-Networks-for-3D-Human-Pose-Estimation.
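To make the sparsity argument concrete, the following is a minimal sketch of a single graph convolution over a skeleton-shaped graph, using the common symmetric normalization D^{-1/2}(A + I)D^{-1/2}. It is not the HGN architecture or its multi-scale graph building strategy: the 5-joint skeleton, the coordinates, the weight matrix, and all function names are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical 5-joint toy skeleton (assumption, not the paper's joint set):
# 0 pelvis, 1 spine, 2 neck, 3 head, 4 shoulder.
EDGES = [(0, 1), (1, 2), (2, 3), (2, 4)]
NUM_JOINTS = 5


def normalized_adjacency(num_nodes, edges):
    """Return D^{-1/2} (A + I) D^{-1/2} as a list of lists."""
    a = [[1.0 if i == j else 0.0 for j in range(num_nodes)]
         for i in range(num_nodes)]
    for i, j in edges:
        a[i][j] = 1.0
        a[j][i] = 1.0
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(num_nodes)]
            for i in range(num_nodes)]


def matmul(x, y):
    """Plain-Python matrix product of two lists of lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))]
            for i in range(len(x))]


def graph_conv(features, adj, weight):
    """One layer: ReLU(adj @ features @ weight)."""
    out = matmul(matmul(adj, features), weight)
    return [[max(0.0, v) for v in row] for row in out]


def edge_density(num_nodes, edges):
    """Fraction of all possible undirected edges that are present."""
    return len(edges) / (num_nodes * (num_nodes - 1) / 2)


adj = normalized_adjacency(NUM_JOINTS, EDGES)
# Made-up 2D joint coordinates, one (x, y) row per joint.
joints_2d = [[0.0, 0.0], [0.0, 0.5], [0.0, 1.0], [0.0, 1.3], [0.3, 1.0]]
# Made-up weights mapping 2 input channels to 3 output channels.
weight = [[1.0, 0.0, 0.5], [0.0, 1.0, 0.5]]
lifted = graph_conv(joints_2d, adj, weight)
density = edge_density(NUM_JOINTS, EDGES)
```

In this toy graph only 4 of the 10 possible joint pairs are connected, so each layer mixes information only between directly linked joints. A tree-shaped skeleton with n joints always has n - 1 edges, so this fraction shrinks further as the number of joints grows, which is the kind of sparsity the abstract argues against.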