In person search, we aim to localize a query person from one scene in other gallery scenes. The cost of this search operation depends on the number of gallery scenes, making it beneficial to reduce the pool of candidate scenes. We describe and demonstrate the Gallery Filter Network (GFN), a novel module that can efficiently discard gallery scenes from the search process and benefit scoring for persons detected in the remaining scenes. We show that the GFN is robust under a range of conditions by testing on different retrieval sets, including cross-camera, occluded, and low-resolution scenarios. In addition, we develop the base SeqNeXt person search model, which improves and simplifies the original SeqNet model. We show that the SeqNeXt+GFN combination yields significant performance gains over other state-of-the-art methods on the standard PRW and CUHK-SYSU person search datasets. To aid experimentation with this and other models, we provide standardized tooling for the data processing and evaluation pipeline typically used in person search research.