KTN: Knowledge Transfer Network for Learning Multi-person 2D-3D Correspondences

Human densepose estimation, which aims to establish dense correspondences between the 2D pixels of a human body and a 3D human body template, is a key technique for enabling machines to understand people in images. It remains challenging in practice because real-world scenes are complex and only partial annotations are available, which leads to incomplete or false estimations. In this work, we present a novel framework to detect the densepose of multiple people in an image. The proposed method, which we refer to as the Knowledge Transfer Network (KTN), tackles two main problems: 1) how to refine the image representation to alleviate incomplete estimations, and 2) how to reduce false estimations caused by low-quality training labels (i.e., limited annotations and class-imbalanced labels). Unlike existing works that directly propagate the pyramidal features of regions for densepose estimation, the KTN uses a refined pyramidal representation that simultaneously maintains feature resolution and suppresses background pixels, and this strategy results in a substantial increase in accuracy. Moreover, the KTN enhances the ability of 3D-based body parsing with external knowledge: it casts 2D-based body parsers, trained from sufficient annotations, as a 3D-based body parser through a structural body knowledge graph. In this way, it significantly reduces the adverse effects caused by low-quality annotations. The effectiveness of the KTN is demonstrated by its superior performance over state-of-the-art methods on the DensePose-COCO dataset. Extensive ablation studies and experimental results on representative tasks (e.g., human body segmentation, human part segmentation and keypoint detection) and on two popular densepose estimation pipelines (i.e., RCNN and fully-convolutional frameworks) further indicate the generalizability of the proposed method.
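The abstract does not spell out how the 2D-based parsers are cast as a 3D-based parser, so the following is only a toy sketch of the general idea of moving classifier weights across a part-relation graph. It is not the KTN formulation: the function name `transfer_classifiers`, the row-normalized averaging rule, and the example parts and graph are all hypothetical choices made for illustration.

```python
from typing import List


def transfer_classifiers(weights_2d: List[List[float]],
                         graph: List[List[float]]) -> List[List[float]]:
    """Derive one weight vector per 3D surface part from 2D part classifiers.

    weights_2d[j] is the weight vector of 2D part j.
    graph[i][j] > 0 means 3D surface part i is related to 2D part j.
    Each 3D vector is the edge-weighted average of its related 2D vectors
    (an assumed rule, chosen only to keep the sketch simple).
    """
    dim = len(weights_2d[0])
    weights_3d = []
    for row in graph:
        total = sum(row)
        if total == 0:
            raise ValueError("every 3D part needs at least one related 2D part")
        mixed = [0.0] * dim
        for edge, vec in zip(row, weights_2d):
            for k, value in enumerate(vec):
                mixed[k] += (edge / total) * value
        weights_3d.append(mixed)
    return weights_3d


# Hypothetical toy setup: two 2D parts (torso, arm) and three 3D surface parts.
weights_2d = [[1.0, 0.0],   # torso classifier
              [0.0, 1.0]]   # arm classifier
graph = [[1, 0],            # front torso surface <- torso
         [1, 0],            # back torso surface  <- torso
         [1, 1]]            # shoulder surface    <- torso and arm
weights_3d = transfer_classifiers(weights_2d, graph)
```

In this toy setup the two torso surfaces inherit the torso classifier unchanged, while the shoulder surface receives an equal mix of the torso and arm classifiers. A learned transfer would replace the fixed averaging with trainable graph weights.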
