Occlusion handling is a key issue in pedestrian attribute recognition (PAR). Nevertheless, existing video-based PAR methods have not yet considered occlusion handling in depth. In this paper, we formulate the problem of finding non-occluded frames as sparsity-based temporal attention over a crowded video. In this manner, the model is guided not to pay attention to occluded frames. However, temporal sparsity alone cannot capture the correlation between attributes when occlusion occurs. For example, "boots" and "shoe color" cannot be recognized when the feet are invisible. To address this uncorrelated-attention issue, we also propose a novel group sparsity-based temporal attention module. Group sparsity is applied over the attention weights of correlated attributes, so that the attention weights within a group are forced to focus on the same frames. Experimental results show that the proposed method achieves a higher F1-score than state-of-the-art methods on two video-based PAR datasets and under five occlusion scenarios.
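The abstract does not spell out the exact regularizer, but a common way to realize "attention weights in a group attend to the same frames" is a group lasso (L2,1-norm) over the temporal attention weights of each attribute group. The sketch below illustrates this idea only; the function name, the `attribute_groups` structure, and the weight `lambda_g` are hypothetical and not taken from the paper.

```python
import torch

def group_sparsity_penalty(attn: torch.Tensor, groups: list[list[int]]) -> torch.Tensor:
    """Group-sparsity (L2,1) penalty on temporal attention weights.

    attn:   (num_attributes, T) tensor; one temporal attention vector per attribute.
    groups: attribute indices that are correlated, e.g. [[idx_boots, idx_shoe_color], ...].

    For each group and each frame, the L2 norm is taken over the group's
    attention weights; summing these norms over frames encourages whole
    frame columns to be zeroed jointly, so correlated attributes end up
    attending to the same (non-occluded) frames.
    """
    penalty = attn.new_zeros(())
    for idx in groups:
        group_attn = attn[idx]                                  # (|group|, T)
        penalty = penalty + group_attn.norm(p=2, dim=0).sum()   # L2 over attributes, L1 over frames
    return penalty

# Hypothetical usage: add the penalty to the recognition loss.
# loss = bce_loss + lambda_g * group_sparsity_penalty(attn_weights, attribute_groups)
```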