Visible-Infrared person re-identification (VI-ReID) aims at matching cross-modality pedestrian images, breaking through the limitation of single-modality person ReID in dark environments. In order to mitigate the impact of the large modality discrepancy, existing works manually design various two-stream architectures to separately learn modality-specific and modality-sharable representations. Such a manual design routine, however, heavily depends on extensive experiments and empirical practice, which is time-consuming and labor-intensive. In this paper, we systematically study the manually designed architectures and identify that appropriately splitting Batch Normalization (BN) layers to learn modality-specific representations brings a substantial boost to cross-modality matching. Based on this observation, the essential objective is to find the optimal splitting scheme for each BN layer. To this end, we propose a novel method, named Cross-Modality Neural Architecture Search (CM-NAS). It consists of a BN-oriented search space in which standard optimization can be performed subject to the cross-modality task. Besides, in order to better guide the search process, we further formulate a new Correlation Consistency based Class-specific Maximum Mean Discrepancy (C3MMD) loss. Beyond reducing the modality discrepancy, it also aligns the similarity correlations across the two modalities, which previous works have overlooked. Benefiting from these advantages, our method outperforms state-of-the-art counterparts in extensive experiments, improving the Rank-1/mAP by 6.70%/6.13% on SYSU-MM01 and 12.17%/11.23% on RegDB. The source code will be released soon.
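To make the "split BN" idea concrete, below is a minimal PyTorch sketch of a block whose convolution weights are shared across modalities while each modality keeps its own BN statistics and affine parameters. The module and argument names are illustrative assumptions, not the authors' released implementation; CM-NAS searches, per BN layer, whether to share or split it, whereas this sketch hard-codes a split.

```python
import torch
import torch.nn as nn

class ModalitySplitBN(nn.Module):
    """Shared convolution with per-modality BatchNorm (illustrative sketch).

    The conv weights are modality-sharable; the visible and infrared inputs
    each normalize through their own BN layer, so modality-specific
    statistics and scale/shift parameters are learned.
    """
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, 3, padding=1, bias=False)  # shared
        self.bn_vis = nn.BatchNorm2d(out_ch)  # visible-specific BN
        self.bn_ir = nn.BatchNorm2d(out_ch)   # infrared-specific BN
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor, modality: str) -> torch.Tensor:
        x = self.conv(x)
        bn = self.bn_vis if modality == "visible" else self.bn_ir
        return self.relu(bn(x))
```

In a NAS formulation, each BN layer would carry a binary choice (shared vs. split like the above), and the search optimizes that choice jointly across the network.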
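The class-specific MMD component of the loss can also be sketched. The following is a simplified, assumed version that matches the modality-wise class means with a linear (mean-matching) kernel; the paper's full C3MMD additionally enforces correlation consistency between the two modalities, which is omitted here. Function and variable names are hypothetical.

```python
import torch

def class_specific_mmd(feat_vis: torch.Tensor, feat_ir: torch.Tensor,
                       labels_vis: torch.Tensor, labels_ir: torch.Tensor) -> torch.Tensor:
    """Per-class linear MMD between visible and infrared features (sketch).

    For every identity present in both modalities within the batch,
    penalize the squared distance between its visible and infrared
    class-mean features, then average over identities.
    """
    loss = feat_vis.new_zeros(())
    count = 0
    for c in labels_vis.unique():
        mask_v = labels_vis == c
        mask_i = labels_ir == c
        if mask_i.any():  # identity must appear in the infrared batch too
            mu_v = feat_vis[mask_v].mean(dim=0)
            mu_i = feat_ir[mask_i].mean(dim=0)
            loss = loss + (mu_v - mu_i).pow(2).sum()
            count += 1
    return loss / max(count, 1)
```

Used as an auxiliary objective during the architecture search, such a term rewards splitting schemes whose learned features bring the two modalities closer on a per-identity basis rather than only globally.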