Visible-Infrared person re-identification (VI-ReID) aims to match pedestrian images across modalities, overcoming the limitation of single-modality person ReID in dark environments. To mitigate the large modality discrepancy, existing works manually design various two-stream architectures that separately learn modality-specific and modality-sharable representations. Such manual design, however, depends heavily on extensive experiments and empirical practice, which is time-consuming and labor-intensive. In this paper, we systematically study the manually designed architectures and identify that appropriately separating Batch Normalization (BN) layers is the key to substantially improving cross-modality matching. Based on this observation, the essential objective is to find the optimal separation scheme for each BN layer. To this end, we propose a novel method named Cross-Modality Neural Architecture Search (CM-NAS). It consists of a BN-oriented search space in which standard optimization can be carried out subject to the cross-modality task. Equipped with the searched architecture, our method outperforms state-of-the-art counterparts on both benchmarks, improving Rank-1/mAP by 6.70%/6.13% on SYSU-MM01 and by 12.17%/11.23% on RegDB. Code is released at https://github.com/JDAI-CV/CM-NAS.
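To make the idea of a per-layer BN separation scheme concrete, the following is a minimal toy sketch and not the CM-NAS implementation (the released code is at the URL above). It assumes that each BN layer faces a binary choice between sharing statistics across modalities and keeping them separate, uses scalar activations instead of feature maps, and omits the learnable affine parameters. The names `batch_norm`, `bn_layer`, and `enumerate_schemes` are hypothetical.

```python
from itertools import product
from statistics import fmean, pvariance


def batch_norm(values, eps=1e-5):
    """Normalize scalar activations to zero mean and (near) unit variance."""
    mean = fmean(values)
    var = pvariance(values, mu=mean)
    return [(v - mean) / (var + eps) ** 0.5 for v in values]


def bn_layer(visible, infrared, separate):
    """Apply one toy BN layer to a batch holding both modalities.

    separate=True  -> each modality is normalized with its own statistics.
    separate=False -> statistics are computed over the mixed batch.
    """
    if separate:
        return batch_norm(visible), batch_norm(infrared)
    mixed = batch_norm(visible + infrared)
    return mixed[:len(visible)], mixed[len(visible):]


def enumerate_schemes(num_bn_layers):
    """List every share/separate assignment (assumed binary per layer)."""
    return list(product((False, True), repeat=num_bn_layers))


if __name__ == "__main__":
    # Toy activations whose means differ strongly between modalities.
    visible = [1.0, 2.0, 3.0]
    infrared = [11.0, 12.0, 13.0]

    sep_v, sep_i = bn_layer(visible, infrared, separate=True)
    shr_v, shr_i = bn_layer(visible, infrared, separate=False)

    print("separate, visible mean:", fmean(sep_v))
    print("separate, infrared mean:", fmean(sep_i))
    print("shared, visible mean:", fmean(shr_v))
    print("shared, infrared mean:", fmean(shr_i))
    print("candidate schemes for 3 BN layers:", len(enumerate_schemes(3)))
```

With separate statistics, both modalities are centered individually, whereas shared statistics leave the modality offset in the normalized activations. Under the binary-choice assumption the number of candidate schemes doubles with every BN layer, which is one way to see why exhaustive manual design becomes impractical for deep backbones.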