Most existing group activity recognition methods construct spatial-temporal relations based solely on visual representations. Some methods introduce extra knowledge, such as action labels, to build semantic relations and use them to refine the visual representations. However, the knowledge they exploit remains at the semantic level, which is insufficient for achieving notable accuracy. In this paper, we propose to exploit knowledge concretization for group activity recognition, and develop a novel Knowledge Augmented Relation Inference framework that can effectively use the concretized knowledge to improve the individual representations. Specifically, the framework consists of a Visual Representation Module that extracts individual appearance features, a Knowledge Augmented Semantic Relation Module that explores semantic representations of individual actions, and a Knowledge-Semantic-Visual Interaction Module that integrates visual and semantic information through the knowledge. Benefiting from these modules, the proposed framework can utilize knowledge to enhance the relation inference process and the individual representations, thus improving the performance of group activity recognition. Experimental results on two public datasets show that the proposed framework achieves competitive performance compared with state-of-the-art methods.