Person Re-Identification is a challenging task that aims to retrieve all instances of a query image across a system of non-overlapping cameras. Because of extreme viewpoint changes, the local regions that could be used to match people are often suppressed, so approaches must evaluate image similarity from less informative regions. In this work, we introduce Top-DB-Net, a method based on Top DropBlock that pushes the network to focus on the scene foreground, with special emphasis on the most task-relevant regions, while also encoding low-informative regions so that they provide high discriminability. Top-DB-Net is composed of three streams: (i) a global stream that encodes rich image information from a backbone, (ii) the Top DropBlock stream, which encourages the backbone to encode low-informative regions with highly discriminative features, and (iii) a regularization stream that helps to deal with the noise created by the dropping process of the second stream. At test time, only the first two streams are used. Extensive experiments on three challenging datasets show the capabilities of our approach against state-of-the-art methods. Qualitative results demonstrate that our method exhibits better activation maps, focusing on reliable parts of the input images.