Person Re-Identification (ReID) requires comparing two images of a person captured under different conditions. Existing neural-network approaches often compute the similarity of feature maps from a single convolutional layer. In this work, we propose an efficient, end-to-end, fully convolutional Siamese network that computes similarities at multiple levels. We demonstrate that multi-level similarity can considerably improve accuracy on the ReID problem while using low-complexity network structures. Specifically, we first use several convolutional layers to extract features from the two input images. We then propose the Convolution Similarity Network to compute similarity score maps for the inputs. Within it, we use spatial transformer networks (STNs) to determine spatial attention and apply efficient depth-wise convolution to compute the similarity. The proposed Convolution Similarity Network can be inserted at different convolutional layers to extract visual similarities at different levels. Furthermore, we use an improved ranking loss to further improve performance. Our work is the first to propose computing visual similarities at low, middle and high levels for ReID. With extensive experiments and analysis, we demonstrate that our system, compact yet effective, achieves competitive results with a much smaller model size and lower computational complexity.
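To make the depth-wise similarity step concrete, the following is a minimal sketch rather than the authors' implementation. It assumes that a patch of one image's feature map (for example, a region selected by the STN) is used as a depth-wise kernel and cross-correlated, channel by channel, with the other image's feature map to give one similarity score map per channel. The function name `depthwise_similarity`, the `FeatureMap` alias, and the valid-padding, stride-1 setting are illustrative assumptions, and pure Python lists are used only to keep the example dependency-free.

```python
from typing import List

# Hypothetical layout for illustration: feature_map[channel][row][col]
FeatureMap = List[List[List[float]]]


def depthwise_similarity(features_a: FeatureMap, kernel_b: FeatureMap) -> FeatureMap:
    """Cross-correlate each channel of features_a with the matching channel
    of kernel_b (valid padding, stride 1). Channels are never mixed, which is
    what makes the operation depth-wise."""
    if len(features_a) != len(kernel_b):
        raise ValueError("both inputs must have the same number of channels")

    score_maps: FeatureMap = []
    for chan_a, chan_k in zip(features_a, kernel_b):
        height, width = len(chan_a), len(chan_a[0])
        k_height, k_width = len(chan_k), len(chan_k[0])
        score_map = []
        for i in range(height - k_height + 1):
            row = []
            for j in range(width - k_width + 1):
                # Inner product between the kernel and the window at (i, j).
                row.append(sum(
                    chan_a[i + di][j + dj] * chan_k[di][dj]
                    for di in range(k_height)
                    for dj in range(k_width)
                ))
            score_map.append(row)
        score_maps.append(score_map)
    return score_maps


# Toy inputs: two channels of 3x3 features from image A,
# and a 2x2 patch per channel taken from image B's features.
features_a = [
    [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]],
    [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]],
]
kernel_b = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.5, 0.5], [0.5, 0.5]],
]
scores = depthwise_similarity(features_a, kernel_b)
```

Because each channel is correlated only with its counterpart, the number of multiply-adds grows linearly with the channel count instead of quadratically as in a standard convolution, which is consistent with the efficiency argument made above.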