Cross-spectral person re-identification, which aims to associate the identities of pedestrians across different spectra, faces the key challenge of modality discrepancy. In this paper, we address the problem at both the image level and the feature level in an end-to-end hybrid learning framework named the robust feature mining network (RFM). In particular, we observe that the reflective intensity of the same surface in images captured at different wavelengths can be related by a linear model. Moreover, we show that this linear factor varies across surfaces, and that this variation is the main cause of the modality discrepancy. We integrate this reflection observation into an image-level data augmentation by proposing the linear transformation generator (LTG). At the feature level, we introduce a cross-center loss to encourage a more compact intra-class distribution, together with modality-aware spatial attention to exploit textured regions more effectively. Experimental results on two standard cross-spectral person re-identification datasets, i.e., RegDB and SYSU-MM01, demonstrate state-of-the-art performance.
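The reflection observation above suggests a simple image-level augmentation: since intensities of the same surface at different wavelengths are related by a linear factor that varies across surfaces, one can simulate spectral shifts by applying random linear scalings. The sketch below is a minimal stand-in for an LTG-style augmentation, not the paper's actual implementation; the function name, the per-channel granularity, and the factor range are all assumptions.

```python
import numpy as np

def linear_transform_augment(img, rng, low=0.5, high=1.5):
    """Toy LTG-style augmentation (hypothetical parameters):
    multiply each channel by an independent random linear factor,
    mimicking how reflective intensity shifts across wavelengths.

    img: float array in [0, 1] with shape (H, W, C).
    """
    factors = rng.uniform(low, high, size=(1, 1, img.shape[-1]))
    return np.clip(img * factors, 0.0, 1.0)

rng = np.random.default_rng(0)
img = rng.random((4, 4, 3))        # dummy 4x4 RGB image
aug = linear_transform_augment(img, rng)
```

In practice such an augmentation would be applied on the fly during training, so the network sees many simulated spectral variants of each image.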
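The cross-center loss is described only at a high level here; the paper defines its exact form. As a heavily hedged illustration of the general idea, the sketch below penalizes the distance between the per-identity feature centers of the two modalities, pulling the intra-class distribution together across spectra. The function name and the squared-distance formulation are assumptions, not the paper's definition.

```python
import numpy as np

def cross_center_loss_sketch(feat_vis, feat_ir):
    """Toy sketch (assumed formulation): squared Euclidean distance
    between the visible-modality and infrared-modality feature
    centers of a single identity.

    feat_vis: (n_vis, d) features of one identity, visible images.
    feat_ir:  (n_ir, d) features of the same identity, infrared images.
    """
    c_vis = feat_vis.mean(axis=0)   # visible-modality class center
    c_ir = feat_ir.mean(axis=0)     # infrared-modality class center
    return float(np.sum((c_vis - c_ir) ** 2))
```

Minimizing such a term drives the two modality centers of each identity toward a common point, which is one plausible way to obtain the "more compact intra-class distribution" the abstract mentions.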