Gait recognition, which refers to the recognition or identification of a person based on their body shape and walking style, derived from video data captured from a distance, is widely used in crime prevention, forensic identification, and social security. However, to the best of our knowledge, most existing methods use appearance, posture, and temporal features without considering a learned temporal attention mechanism for fusing global and local information. In this paper, we propose a novel gait recognition framework, called Temporal Attention and Keypoint-guided Embedding (GaitTAKE), which effectively fuses temporal-attention-based global and local appearance features with temporally aggregated human pose features. Experimental results show that our proposed method achieves new state-of-the-art (SOTA) performance in gait recognition, with rank-1 accuracy of 98.0% (normal), 97.5% (bag), and 92.2% (coat) on the CASIA-B gait dataset, and 90.4% accuracy on the OU-MVLP gait dataset.
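To make the idea of temporal-attention-based fusion concrete, the sketch below shows one plausible way to attend over per-frame features and combine them with a pose embedding. It is a minimal illustration, not the paper's actual implementation: the module names (`TemporalAttentionPool`, `GaitFusionHead`), the feature dimensions, the concatenation-based fusion, and the mean pooling of pose features are all assumptions made for exposition.

```python
# A minimal sketch (not the authors' implementation) of temporal-attention
# pooling plus appearance/pose fusion. All names and dimensions below are
# illustrative assumptions.
import torch
import torch.nn as nn


class TemporalAttentionPool(nn.Module):
    """Aggregates a (batch, time, dim) feature sequence with learned
    per-frame attention weights instead of plain mean/max pooling."""

    def __init__(self, dim: int):
        super().__init__()
        self.score = nn.Linear(dim, 1)  # one scalar score per frame

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, T, D) -> softmax over the T frames gives attention weights
        w = torch.softmax(self.score(x), dim=1)  # (B, T, 1)
        return (w * x).sum(dim=1)                # (B, D)


class GaitFusionHead(nn.Module):
    """Fuses temporally attended global and local appearance features
    with a temporally aggregated pose-keypoint feature."""

    def __init__(self, app_dim: int, pose_dim: int, out_dim: int):
        super().__init__()
        self.global_pool = TemporalAttentionPool(app_dim)
        self.local_pool = TemporalAttentionPool(app_dim)
        self.fuse = nn.Linear(2 * app_dim + pose_dim, out_dim)

    def forward(self, g, l, pose):
        # g, l: (B, T, app_dim) global/local appearance sequences
        # pose: (B, T, pose_dim) per-frame keypoint features
        fused = torch.cat(
            [self.global_pool(g), self.local_pool(l), pose.mean(dim=1)],
            dim=-1,
        )
        return self.fuse(fused)  # (B, out_dim) embedding for matching


if __name__ == "__main__":
    head = GaitFusionHead(app_dim=256, pose_dim=64, out_dim=128)
    g = torch.randn(4, 30, 256)    # 4 sequences, 30 frames each
    l = torch.randn(4, 30, 256)
    pose = torch.randn(4, 30, 64)
    print(head(g, l, pose).shape)  # torch.Size([4, 128])
```

In a recognition pipeline, an embedding like this would be trained with a metric-learning objective (e.g., triplet loss) so that sequences of the same subject map to nearby points.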