Spatial and Semantic Consistency Regularizations for Pedestrian Attribute Recognition

While recent studies on pedestrian attribute recognition have made remarkable progress by leveraging complicated networks and attention mechanisms, most of them neglect inter-image relations and an important prior: the spatial consistency and semantic consistency of attributes in surveillance scenarios. The spatial location of the same attribute should be consistent across different pedestrian images; e.g., the "hat" attribute and the "boots" attribute are always located at the top and bottom of the picture, respectively. In addition, the inherent semantic feature of the "hat" attribute should be consistent, whether the hat is a baseball cap, a beret, or a helmet. To fully exploit inter-image relations and incorporate this human prior into the model learning process, we construct a Spatial and Semantic Consistency (SSC) framework that consists of two complementary regularizations, one enforcing spatial consistency and one enforcing semantic consistency for each attribute. Specifically, we first propose a spatial consistency regularization that focuses the model on reliable and stable attribute-related regions. Based on the resulting precise attribute locations, we further propose a semantic consistency regularization that extracts intrinsic and discriminative semantic features. We conduct extensive experiments on popular benchmarks including PA100K, RAP, and PETA. Results show that the proposed method performs favorably against state-of-the-art methods without increasing the number of parameters.
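
The spatial consistency idea can be illustrated with a minimal sketch. This is not the authors' formulation, which the abstract does not specify: the function name `spatial_consistency_loss`, the use of a per-batch mean map as the reference, and the squared-distance penalty are all assumptions made for illustration. The sketch penalizes how far each image's attention map for an attribute deviates from the average map of that attribute over the images that carry it.

```python
import numpy as np

def spatial_consistency_loss(attn, labels, eps=1e-8):
    """Hypothetical sketch of a spatial consistency penalty.

    attn:   (B, A, H, W) non-negative attention maps, one per image and attribute.
    labels: (B, A) binary matrix, 1 where the image has the attribute.
    """
    B, A, H, W = attn.shape
    flat = attn.reshape(B, A, H * W)
    # Normalize each map to sum to 1 so only the location pattern matters.
    flat = flat / (flat.sum(axis=-1, keepdims=True) + eps)

    mask = labels.astype(flat.dtype)[:, :, None]   # (B, A, 1)
    count = mask.sum(axis=0)                       # (A, 1) positives per attribute
    # Assumed reference: mean map of each attribute over its positive images.
    mean_map = (flat * mask).sum(axis=0) / np.maximum(count, 1.0)  # (A, H*W)

    # Squared distance of every map to its attribute's reference map.
    sq_dist = ((flat - mean_map[None]) ** 2).sum(axis=-1)          # (B, A)
    n_pos = labels.sum()
    # Average over positive (image, attribute) pairs only.
    return float((sq_dist * labels).sum() / max(n_pos, 1))

# Two images, one attribute ("hat"), 2x2 maps.
top = np.array([[1.0, 1.0], [0.0, 0.0]])
bottom = np.array([[0.0, 0.0], [1.0, 1.0]])
labels = np.array([[1], [1]])

consistent = np.stack([top, top])[:, None]       # both maps at the top
inconsistent = np.stack([top, bottom])[:, None]  # maps disagree on location

loss_consistent = spatial_consistency_loss(consistent, labels)
loss_inconsistent = spatial_consistency_loss(inconsistent, labels)
```

Under these assumptions the penalty is zero when both images place the attribute in the same region and grows as the locations diverge. An analogous term on per-attribute feature vectors, rather than maps, would be one way to sketch the semantic counterpart.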
