Learning representative, robust and discriminative information from images is essential for effective person re-identification (Re-Id). In this paper, we propose a compound approach to end-to-end discriminative deep feature learning for person Re-Id based on both body and hand images. We carefully design the Local-Aware Global Attention Network (LAGA-Net), a multi-branch deep network architecture consisting of one branch for spatial attention, one for channel attention, one for global feature representations, and one for local feature representations. The attention branches focus on the relevant features of the image while suppressing the irrelevant background. To overcome a weakness of attention mechanisms, namely their equivariance to pixel shuffling, we integrate relative positional encodings into the spatial attention module to capture the spatial positions of pixels. The global branch aims to preserve global context and structural information. For the local branch, which aims to capture fine-grained information, we perform uniform partitioning on the conv-layer to generate horizontal stripes. We retrieve the parts by conducting a soft partition, without explicitly partitioning the images or requiring external cues such as pose estimation. A set of ablation studies shows that each component contributes to the improved performance of LAGA-Net. Extensive evaluations on four popular body-based person Re-Id benchmarks and two publicly available hand datasets demonstrate that our proposed method consistently outperforms existing state-of-the-art methods.
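The local branch described above pools uniform horizontal stripes of the conv feature map into part-level features. The following is a minimal sketch of that partitioning step, assuming a (C, H, W) feature map, average pooling per stripe, and the stripe count shown; these shapes and the pooling choice are illustrative assumptions, not details stated in the abstract.

```python
import numpy as np

def horizontal_stripes(feature_map: np.ndarray, num_parts: int) -> np.ndarray:
    """Uniformly partition a conv feature map of shape (C, H, W) into
    `num_parts` horizontal stripes and average-pool each stripe into a
    part-level feature vector. Returns an array of shape (num_parts, C).
    """
    c, h, w = feature_map.shape
    # A uniform (hard) partition requires H to divide evenly into stripes.
    assert h % num_parts == 0, "H must be divisible by num_parts"
    stripe_h = h // num_parts
    parts = []
    for i in range(num_parts):
        stripe = feature_map[:, i * stripe_h:(i + 1) * stripe_h, :]
        parts.append(stripe.mean(axis=(1, 2)))  # average pool over each stripe
    return np.stack(parts)

# Example: a 2048-channel, 24x8 feature map split into 6 part features.
feats = horizontal_stripes(np.random.rand(2048, 24, 8), num_parts=6)
print(feats.shape)  # (6, 2048)
```

Note that the paper's soft partition refines this hard split by assigning pixels to parts without an explicit image-level cut; the hard uniform split above is only the baseline partitioning it starts from.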