Person search aims to jointly localize and identify a query person in natural, uncropped images, and has been actively studied in the computer vision community over the past few years. In this paper, we delve into the rich context information surrounding the target person both globally and locally, which we refer to as scene and group context, respectively. Unlike previous works that treat the two types of context individually, we exploit them in a unified global-local context network (GLCNet) with the intuitive aim of feature enhancement. Specifically, re-ID embeddings and context features are enhanced simultaneously in a multi-stage fashion, ultimately yielding enhanced, discriminative features for person search. We conduct experiments on two person search benchmarks (i.e., CUHK-SYSU and PRW) and also extend our approach to a more challenging setting (i.e., character search on MovieNet). Extensive experimental results demonstrate consistent improvements of the proposed GLCNet over state-of-the-art methods on all three datasets. Our source code, pre-trained models, and the new setting for character search are available at: https://github.com/ZhengPeng7/GLCNet.
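To make the multi-stage enhancement idea concrete, the following is a minimal, illustrative sketch (not the authors' implementation; all module names, dimensions, and the stage count are assumptions) of how a re-ID embedding might be repeatedly fused with global scene context and local group context:

```python
import torch
import torch.nn as nn

class ContextEnhancement(nn.Module):
    """Hypothetical sketch of one enhancement stage: fuses a person's
    re-ID embedding with global (scene) and local (group) context
    features, in the spirit of the abstract's description."""

    def __init__(self, dim=256):
        super().__init__()
        # Project the concatenated [re-ID | scene | group] features
        # back to the embedding dimension.
        self.fuse = nn.Sequential(
            nn.Linear(3 * dim, dim),
            nn.ReLU(inplace=True),
            nn.Linear(dim, dim),
        )

    def forward(self, reid_feat, scene_feat, group_feat):
        # Residual fusion: keep the original re-ID signal and
        # add context information on top of it.
        fused = self.fuse(torch.cat([reid_feat, scene_feat, group_feat], dim=-1))
        return reid_feat + fused


# Multi-stage enhancement: refine the embedding at each stage
# (the stage count of 3 is a placeholder assumption).
stages = nn.ModuleList([ContextEnhancement(256) for _ in range(3)])
reid = torch.randn(8, 256)   # per-person re-ID embeddings
scene = torch.randn(8, 256)  # global scene context
group = torch.randn(8, 256)  # local group context
for stage in stages:
    reid = stage(reid, scene, group)
```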