This is an official PyTorch implementation of Deep High-Resolution Representation Learning for Human Pose Estimation. In this work, we are interested in the human pose estimation problem with a focus on learning reliable high-resolution representations. Most existing methods recover high-resolution representations from the low-resolution representations produced by a high-to-low resolution network. Instead, our proposed network maintains high-resolution representations through the whole process. We start from a high-resolution subnetwork as the first stage, gradually add high-to-low resolution subnetworks one by one to form more stages, and connect the multi-resolution subnetworks in parallel. We conduct repeated multi-scale fusions such that each of the high-to-low resolution representations receives information from the other parallel representations over and over, leading to rich high-resolution representations. As a result, the predicted keypoint heatmap is potentially more accurate and spatially more precise. We empirically demonstrate the effectiveness of our network through superior pose estimation results on two benchmark datasets: the COCO keypoint detection dataset and the MPII Human Pose dataset. The code and models are publicly available at https://github.com/leoxiaobin/deep-high-resolution-net.pytorch.
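
To make the repeated multi-scale fusion idea concrete, here is a minimal PyTorch sketch of a single fusion step between a high-resolution branch and a half-resolution branch. The class name, channel widths, and fusion details (strided convolution for downsampling, nearest-neighbor interpolation plus a 1x1 convolution for upsampling, element-wise addition for fusion) are simplified illustrative assumptions, not the repository's actual modules; see the repository for the full implementation.

```python
import torch
from torch import nn
import torch.nn.functional as F


class TwoBranchFusion(nn.Module):
    """Illustrative sketch of one multi-scale fusion step (not the repo's code).

    Maintains a high-resolution branch alongside a half-resolution branch and
    exchanges information between them, so each branch receives the other's
    representation, as described in the abstract above.
    """

    def __init__(self, high_channels=32, low_channels=64):
        super().__init__()
        # High -> low: strided 3x3 conv halves the spatial size and matches channels.
        self.high_to_low = nn.Conv2d(high_channels, low_channels,
                                     kernel_size=3, stride=2, padding=1)
        # Low -> high: 1x1 conv matches channels; upsampling happens in forward().
        self.low_to_high = nn.Conv2d(low_channels, high_channels, kernel_size=1)

    def forward(self, x_high, x_low):
        # Each branch is fused with a resized version of the other branch.
        fused_high = x_high + F.interpolate(
            self.low_to_high(x_low), size=x_high.shape[-2:], mode='nearest')
        fused_low = x_low + self.high_to_low(x_high)
        return fused_high, fused_low


# Toy usage: a 64x48 high-resolution feature map and its 32x24 counterpart.
fuse = TwoBranchFusion()
h = torch.randn(1, 32, 64, 48)
l = torch.randn(1, 64, 32, 24)
h2, l2 = fuse(h, l)
print(h2.shape, l2.shape)  # torch.Size([1, 32, 64, 48]) torch.Size([1, 64, 32, 24])
```

Repeating such fusion steps across stages is what lets the high-resolution branch keep absorbing semantic information from the lower-resolution branches while staying spatially precise.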