Cloth-changing person re-identification (ReID) is a newly emerging research topic that aims to retrieve pedestrians whose clothes have changed. Since human appearance varies greatly across different clothes, it is difficult for existing approaches to extract discriminative and robust feature representations. Current works mainly focus on body shape or contour sketches, but human semantic information and the potential consistency of pedestrian features before and after clothing changes are not fully explored or are simply ignored. To address these issues, this work proposes a novel semantic-aware attention and visual shielding network for cloth-changing person ReID (abbreviated as SAVS), whose key idea is to shield cues related to clothes appearance and focus only on visual semantic information that is insensitive to view/posture changes. Specifically, a visual semantic encoder is first employed to locate the human body and clothing regions based on human semantic segmentation information. Then, a human semantic attention (HSA) module is proposed to highlight human semantic information and reweight the visual feature map. In addition, a visual clothes shielding (VCS) module is designed to extract a more robust feature representation for the cloth-changing task by covering the clothing regions and making the model focus on visual semantic information unrelated to clothes. Most importantly, these two modules are jointly explored in an end-to-end unified framework. Extensive experiments demonstrate that the proposed method significantly outperforms state-of-the-art methods and extracts more robust features for cloth-changing persons. Compared with FSAM (published in CVPR 2021), this method achieves improvements of 32.7% (16.5%) and 14.9% (-) in mAP (rank-1) on the LTCC and PRCC datasets, respectively.
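The two mechanisms the abstract names, semantic attention reweighting (HSA) and clothes shielding (VCS), can be illustrated with a minimal sketch. This is not the authors' implementation: the parsing label ids, the attention weights, and the function names below are all assumptions made for illustration only.

```python
import numpy as np

# Hypothetical human-parsing labels (assumption, not from the paper).
BACKGROUND, BODY, CLOTHES = 0, 1, 2

def semantic_attention(feat, seg, body_weight=1.5, bg_weight=0.5):
    """HSA-like step: reweight the visual feature map so that
    human-body pixels are emphasized over the background.
    feat: (C, H, W) feature map; seg: (H, W) integer parsing map."""
    attn = np.where(seg == BODY, body_weight, bg_weight)  # (H, W)
    return feat * attn[None, :, :]  # broadcast weights over channels

def shield_clothes(feat, seg):
    """VCS-like step: zero out clothing regions so features cannot
    rely on clothes appearance."""
    mask = (seg != CLOTHES).astype(feat.dtype)  # 0 at clothes pixels
    return feat * mask[None, :, :]

# Toy usage: a constant 4-channel feature map over a 2x2 parsing map.
feat = np.ones((4, 2, 2))
seg = np.array([[BODY, CLOTHES],
                [BACKGROUND, BODY]])
out = shield_clothes(semantic_attention(feat, seg), seg)
# Body pixels are boosted, the clothes pixel is shielded to zero.
```

In the actual network both operations act on learned feature maps inside an end-to-end framework; this sketch only shows the masking/reweighting idea on raw arrays.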