Person re-identification is a challenging retrieval task that requires matching images of a person acquired across non-overlapping camera views. In this paper we propose an effective approach that incorporates both the fine and coarse pose information of the person to learn a discriminative embedding. In contrast to the recent direction of explicitly modeling body parts or correcting for misalignment based on them, we show that a rather straightforward inclusion of the acquired camera view and/or the detected joint locations into a convolutional neural network helps to learn a very effective representation. To increase retrieval performance, re-ranking techniques based on computed distances have recently gained much attention. We propose a new unsupervised and automatic re-ranking framework that achieves state-of-the-art re-ranking performance. We show that, in contrast to the current state-of-the-art re-ranking methods, our approach does not require computing new rank lists for each image pair (e.g., based on reciprocal neighbors) and performs well using a simple direct rank-list-based comparison, or even just the already computed Euclidean distances between the images. We show that both our learned representation and our re-ranking method achieve state-of-the-art performance on a number of challenging surveillance image and video datasets. The code is available online at: https://github.com/pse-ecn/pose-sensitive-embedding
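To make the idea of re-ranking from already computed Euclidean distances more concrete, the following is a minimal sketch of a neighborhood-averaging scheme. It is an illustration under assumptions, not the algorithm of the paper or of the linked repository: the function names `pairwise_euclidean` and `neighborhood_rerank`, the neighborhood size `t`, and the symmetric averaging rule are all hypothetical choices made here. The new distance between two images is taken as the average of the original distances between each image and the other image's `t` nearest neighbors, so no new rank lists are computed per image pair.

```python
import numpy as np


def pairwise_euclidean(x):
    """Return the (n, n) matrix of Euclidean distances between rows of x."""
    sq = np.sum(x ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (x @ x.T)
    # Clamp tiny negative values caused by floating-point error.
    return np.sqrt(np.maximum(d2, 0.0))


def neighborhood_rerank(dist, t=3):
    """Hypothetical re-ranking sketch based on neighborhood averaging.

    dist : (n, n) symmetric matrix of original distances.
    t    : number of nearest neighbors per image (an assumed parameter).
    """
    d = dist.copy()
    np.fill_diagonal(d, np.inf)            # exclude each image from its own neighbor list
    nbrs = np.argsort(d, axis=1)[:, :t]    # (n, t) indices of the t nearest neighbors

    # dist[:, nbrs] has shape (n, n, t); entry [i, j, k] is dist[i, nbrs[j, k]].
    # Averaging over k gives the mean distance from image i to the neighbors of j.
    avg_to_nbrs = dist[:, nbrs].mean(axis=2)

    # Symmetrize: combine "i to neighbors of j" with "j to neighbors of i".
    return 0.5 * (avg_to_nbrs + avg_to_nbrs.T)


if __name__ == "__main__":
    # Toy embeddings: two well-separated groups of three points each.
    feats = np.array([[0, 0], [0, 1], [1, 0],
                      [10, 10], [10, 11], [11, 10]], dtype=float)
    original = pairwise_euclidean(feats)
    reranked = neighborhood_rerank(original, t=2)
    print(np.argsort(reranked[0]))  # re-ranked list for image 0
```

In this sketch the re-ranked distance of an image to itself is generally not zero, so in a retrieval setting the probe would be removed from its own rank list before evaluation.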