Typical person re-identification (ReID) methods describe each pedestrian with a single feature vector and match pedestrians in a task-specific metric space. However, methods based on a single feature vector are often insufficient to overcome the visual ambiguity that frequently occurs in real-world scenarios. In this paper, we propose a novel end-to-end trainable framework, called the Dual ATtention Matching network (DuATM), which learns context-aware feature sequences and performs attentive sequence comparison simultaneously. The core component of DuATM is a dual attention mechanism, in which an intra-sequence attention strategy is used for feature refinement and an inter-sequence attention strategy is used for feature-pair alignment. In this way, the detailed visual cues contained in the intermediate feature sequences can be automatically exploited and properly compared. We train the proposed DuATM network as a siamese network via a triplet loss, assisted by a de-correlation loss and a cross-entropy loss. We conduct extensive experiments on both image-based and video-based ReID benchmark datasets, and the results demonstrate significant advantages of our approach over state-of-the-art methods.
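To make the dual attention mechanism concrete, the following is a minimal PyTorch sketch of the idea as described above, not the authors' implementation: intra-sequence attention refines each feature sequence using its own context, inter-sequence attention aligns each refined feature with the most compatible features of the other sequence, and the aligned pairs are compared element-wise to produce a sequence distance. All names (DualAttention, attend, d_model) and the scaled dot-product scoring function are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DualAttention(nn.Module):
    """Sketch of intra-/inter-sequence attention over two feature sequences."""

    def __init__(self, d_model):
        super().__init__()
        # Shared linear projections used to score feature compatibility.
        self.query = nn.Linear(d_model, d_model)
        self.key = nn.Linear(d_model, d_model)

    def attend(self, seq, ctx):
        # Each element of `seq` attends softly over the context sequence `ctx`.
        scores = self.query(seq) @ self.key(ctx).transpose(-2, -1)
        weights = F.softmax(scores / seq.size(-1) ** 0.5, dim=-1)
        return weights @ ctx

    def forward(self, a, b):
        # Intra-sequence attention: refine each sequence with its own context.
        a_ref, b_ref = self.attend(a, a), self.attend(b, b)
        # Inter-sequence attention: align each refined feature with the most
        # compatible features of the other sequence.
        a_aln, b_aln = self.attend(a_ref, b_ref), self.attend(b_ref, a_ref)
        # Element-wise comparison of the aligned feature pairs yields a
        # sequence distance usable inside a triplet loss.
        d_a = (F.normalize(a_ref, dim=-1) - F.normalize(a_aln, dim=-1)).pow(2).sum()
        d_b = (F.normalize(b_ref, dim=-1) - F.normalize(b_aln, dim=-1)).pow(2).sum()
        return (d_a + d_b) / (a.size(0) + b.size(0))

# Toy usage: two sequences of 128-d frame/part features of different lengths.
torch.manual_seed(0)
match = DualAttention(128)
print(match(torch.randn(8, 128), torch.randn(6, 128)))  # scalar distance
```

In a siamese setup, such a distance would be computed for anchor-positive and anchor-negative sequence pairs and fed to the triplet loss mentioned above.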