Group activity recognition aims to understand the activity performed by a group of people. Modeling complex spatio-temporal interactions is the key to solving it. Previous methods are limited to reasoning on a predefined graph, which ignores the inherent person-specific interaction context. Moreover, they adopt inference schemes that are computationally expensive and prone to over-smoothing. In this paper, we achieve spatio-temporal person-specific inference by proposing the Dynamic Inference Network (DIN), which is composed of a Dynamic Relation (DR) module and a Dynamic Walk (DW) module. We first initialize interaction fields on a primary spatio-temporal graph. Within each interaction field, we apply DR to predict the relation matrix and DW to predict the dynamic walk offsets in a joint-processing manner, thus forming a person-specific interaction graph. By updating features on this person-specific graph, a person can obtain a global-level interaction field from a local initialization. Experiments demonstrate the effectiveness of both modules. Moreover, DIN achieves significant improvement over previous state-of-the-art methods on two popular datasets under the same setting, while incurring much lower computational overhead in the reasoning module.
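The inference step described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `dynamic_inference_step`, the linear predictors `w_rel` and `w_walk`, the 3x3 initial field, the softmax normalization, the residual update, and the rounding of walk offsets to the nearest node are all assumptions made only to show how a relation matrix and walk offsets predicted inside a local interaction field could define a person-specific graph.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D vector."""
    e = np.exp(z - z.max())
    return e / e.sum()

def dynamic_inference_step(x, w_rel, w_walk, field=(3, 3)):
    """One hypothetical person-specific update on a T x N spatio-temporal graph.

    x      : (T, N, D) person features over T frames and N persons
    w_rel  : (K*D, K)   assumed linear predictor for relation weights (DR-like)
    w_walk : (K*D, 2*K) assumed linear predictor for walk offsets (DW-like)
    field  : size of the initial local interaction field, K = field[0] * field[1]
    """
    T, N, D = x.shape
    kt, kn = field
    dt = np.arange(kt) - kt // 2
    dn = np.arange(kn) - kn // 2
    # (K, 2) integer offsets of the initial interaction field around a node
    base = np.stack(np.meshgrid(dt, dn, indexing="ij"), axis=-1).reshape(-1, 2)
    K = base.shape[0]
    out = np.empty_like(x)
    for t in range(T):
        for n in range(N):
            # Gather the initial field, clamping indices at the graph border
            ti = np.clip(t + base[:, 0], 0, T - 1)
            ni = np.clip(n + base[:, 1], 0, N - 1)
            ctx = x[ti, ni].reshape(-1)                 # (K*D,)
            # DR-like step: one relation weight per field position
            rel = softmax(ctx @ w_rel)                  # (K,)
            # DW-like step: one (dt, dn) offset per field position
            walk = (ctx @ w_walk).reshape(K, 2)
            # Simplification: round to the nearest node instead of interpolating
            tj = np.clip(np.rint(t + base[:, 0] + walk[:, 0]).astype(int), 0, T - 1)
            nj = np.clip(np.rint(n + base[:, 1] + walk[:, 1]).astype(int), 0, N - 1)
            # Aggregate features from the walked positions, with a residual link
            out[t, n] = x[t, n] + rel @ x[tj, nj]
    return out

# Usage with random features and small random predictors (illustrative only)
rng = np.random.default_rng(0)
T, N, D, K = 4, 5, 8, 9
x = rng.standard_normal((T, N, D))
w_rel = 0.1 * rng.standard_normal((K * D, K))
w_walk = 0.1 * rng.standard_normal((K * D, 2 * K))
updated = dynamic_inference_step(x, w_rel, w_walk)
```

Because the walk offsets move each field position away from its initial location, stacking such steps lets a node reach beyond its local window, which is one way to read the "global-level interaction field with a local initialization" claim. The cost per node depends on the field size K rather than on the full T x N graph.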