We investigate the problem of automatically determining what type of shoe left an impression found at a crime scene. This recognition problem is made difficult by the variability in types of crime scene evidence (ranging from traces of dust or oil on hard surfaces to impressions made in soil) and the lack of comprehensive databases of shoe outsole tread patterns. We find that mid-level features extracted by pre-trained convolutional neural nets are surprisingly effective descriptors for this specialized domain. However, the choice of similarity measure for matching exemplars to a query image is essential to good performance. For matching multi-channel deep features, we propose the use of multi-channel normalized cross-correlation and analyze its effectiveness. Our proposed metric significantly improves performance in matching crime scene shoeprints to laboratory test impressions. We also show its effectiveness in other cross-domain image retrieval problems: matching facade images to segmentation labels and aerial photos to map images. Finally, we introduce a discriminatively trained variant and fine-tune our system through our proposed metric, obtaining state-of-the-art performance.
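To make the idea of multi-channel normalized cross-correlation concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the paper's implementation: it assumes each channel is normalized independently by its own mean and standard deviation, that the per-channel correlations are averaged, and that the two feature maps are already spatially aligned (no search over translations). The function name `mcncc`, the `(C, H, W)` layout, and the `eps` stabilizer are hypothetical choices made for this example.

```python
import numpy as np


def mcncc(x, y, eps=1e-8):
    """Average of per-channel normalized cross-correlations.

    x, y: feature maps of identical shape (C, H, W).
    Assumption: each channel is standardized on its own before correlating.
    """
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    if x.ndim != 3 or x.shape != y.shape:
        raise ValueError("expected two arrays of identical shape (C, H, W)")

    c = x.shape[0]
    xf = x.reshape(c, -1)  # flatten spatial positions per channel
    yf = y.reshape(c, -1)

    # Standardize every channel: zero mean, unit variance (eps avoids 0/0).
    xz = (xf - xf.mean(axis=1, keepdims=True)) / np.sqrt(
        xf.var(axis=1, keepdims=True) + eps
    )
    yz = (yf - yf.mean(axis=1, keepdims=True)) / np.sqrt(
        yf.var(axis=1, keepdims=True) + eps
    )

    # Correlation per channel, then the mean over channels.
    per_channel = (xz * yz).mean(axis=1)
    return float(per_channel.mean())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    query = rng.normal(size=(4, 8, 8))      # stand-in for a query's features
    exemplar = rng.normal(size=(4, 8, 8))   # stand-in for a database exemplar
    print(mcncc(query, query), mcncc(query, exemplar))
```

Because each channel is standardized separately, the score in this sketch is unchanged when a channel of either input is rescaled or shifted by a constant, which is one plausible reason such a measure suits features whose channels differ widely in magnitude.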