Visible-infrared person re-identification (VI-ReID) is challenging due to the large discrepancies between the visible and infrared modalities. Most pioneering approaches reduce intra-class variations and inter-modality discrepancies by learning modality-shared and ID-related features. However, an explicit modality-shared cue, i.e., body keypoints, has not been fully exploited in VI-ReID. Additionally, existing feature learning paradigms impose constraints on either global features or partitioned feature stripes, neglecting the prediction consistency between global and part features. To address these problems, we exploit Pose Estimation as an auxiliary learning task to assist the VI-ReID task in an end-to-end framework. By jointly training the two tasks in a mutually beneficial manner, our model learns higher-quality modality-shared and ID-related features. On top of that, the learning of global and local features is seamlessly synchronized by a Hierarchical Feature Constraint (HFC), in which the former supervises the latter using a knowledge distillation strategy. Experimental results on two benchmark VI-ReID datasets show that the proposed method consistently outperforms state-of-the-art methods by significant margins; in particular, it achieves nearly 20$\%$ mAP improvement over the state-of-the-art method on the RegDB dataset. Our findings highlight the value of auxiliary task learning in VI-ReID.
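To make the global-to-part supervision concrete, the following is a minimal sketch of a distillation-style consistency term in the spirit of the HFC described above: the global branch's ID predictions act as a soft teacher for each part stripe's predictions via a temperature-scaled KL divergence. The function name, the temperature value, and the exact loss form are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def softmax(logits, T=1.0):
    # Temperature-scaled softmax with max-subtraction for numerical stability.
    z = logits / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def hfc_distill_loss(global_logits, part_logits_list, T=4.0):
    """Distillation-style consistency loss (illustrative sketch).

    global_logits:    (N, C) ID predictions from the global feature (teacher).
    part_logits_list: list of (N, C) predictions from part stripes (students).
    Returns the mean KL(teacher || student) over parts, scaled by T^2 as is
    conventional in knowledge distillation. The teacher is treated as a fixed
    target (no gradient would flow through it in a real framework).
    """
    teacher = softmax(global_logits, T)
    eps = 1e-12
    loss = 0.0
    for part_logits in part_logits_list:
        student = softmax(part_logits, T)
        kl = np.sum(teacher * (np.log(teacher + eps) - np.log(student + eps)), axis=-1)
        loss += np.mean(kl)
    return (T ** 2) * loss / len(part_logits_list)
```

When part predictions match the global predictions the loss is zero, so minimizing it pulls each stripe's ID distribution toward the global one, which is the prediction-consistency idea the abstract refers to.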