Human identification is an important topic in event detection, person tracking, and public security. Numerous methods have been proposed for it, including face identification, person re-identification, and gait identification. Most existing methods classify a query image as a specific identity in an image gallery set (I2I). This is a serious limitation in the wide range of video surveillance applications where only a textual description of the query, or only an attribute gallery set, is available (A2I or I2A). Yet very few efforts have been devoted to modality-free identification, i.e., identifying a query in a gallery set in a scalable way. In this work, we make an initial attempt and formulate this novel task, Modality-Free Human Identification (MFHI), as a generic zero-shot learning model. The model bridges the visual and semantic modalities by learning a discriminative prototype for each identity. In addition, semantics-guided spatial attention is applied to the visual modality to obtain representations that are discriminative at both the global category level and the local attribute level. Finally, we design and conduct extensive experiments on two common and challenging identification tasks, face identification and person re-identification, demonstrating that our method outperforms a wide variety of state-of-the-art methods on modality-free human identification.
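The idea of matching a query and a gallery of different modalities through a shared prototype space can be illustrated with a minimal sketch. This is not the MFHI model itself: the linear projections `W_img` and `W_attr`, the feature dimensions, and the toy data are all assumptions made for illustration, and plain cosine-similarity nearest-neighbor search stands in for whatever matching rule the actual method uses.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def embed(features, projection):
    """Project features of one modality into the shared space (hypothetical linear map)."""
    return l2_normalize(features @ projection)

def identify(query_emb, gallery_emb):
    """For each query, return the index of the most similar gallery entry."""
    scores = query_emb @ gallery_emb.T  # cosine similarity matrix
    return scores.argmax(axis=1)

# Toy setup (all values are illustrative assumptions, not from the paper).
rng = np.random.default_rng(0)
n_ids, d_img, d_attr, d_shared = 5, 16, 8, 4

# One prototype per identity in the shared space.
prototypes = l2_normalize(rng.normal(size=(n_ids, d_shared)))

# Hypothetical projections from each modality into the shared space.
W_img = rng.normal(size=(d_img, d_shared))
W_attr = rng.normal(size=(d_attr, d_shared))

# Build toy features that map exactly onto the prototypes,
# so that both modalities agree on each identity.
img_feats = prototypes @ np.linalg.pinv(W_img)
attr_feats = prototypes @ np.linalg.pinv(W_attr)

img_emb = embed(img_feats, W_img)
attr_emb = embed(attr_feats, W_attr)

# A2I: attribute queries against an image gallery.
print("A2I matches:", identify(attr_emb, img_emb))
# I2A: image queries against an attribute gallery.
print("I2A matches:", identify(img_emb, attr_emb))
```

Because both modalities are embedded in the same space, the same `identify` function serves I2I, A2I, and I2A without modification; only the choice of which embeddings act as query and gallery changes.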