Most research on group activity recognition relies on the standard two-stream approach (RGB and optical flow) for its input features. Few works have explored explicit pose information, and none has used it directly to reason about the interactions between persons. In this paper, we leverage skeleton information to learn the interactions between individuals directly from it. Our proposed method, GIRN, infers multiple relationship types with independent modules, each describing the relations between body joints pair by pair. In addition to the joint relations, we also experiment with the previously unexplored relationship between individuals and relevant objects (e.g., the volleyball). The distinct relations of each individual are then merged through an attention mechanism that gives more importance to the individuals most relevant for distinguishing the group activity. We evaluate our method on the Volleyball dataset, obtaining results competitive with the state of the art. Our experiments demonstrate the potential of skeleton-based approaches for modeling multi-person interactions.
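To make the pipeline described above more concrete, the following is a minimal sketch of pairwise joint relations followed by attention-based merging over individuals. It is not the GIRN implementation: the function names (`relation_module`, `attention_pool`, `group_descriptor`, `random_weights`), the single ReLU layer used as the pairwise function, the dot-product attention scoring, the 2D joint coordinates, and the random untrained weights are all assumptions chosen for illustration.

```python
import math
import random


def relation_module(joints_a, joints_b, weights):
    """Sum a small function over every joint pair (a_i, b_j).

    `weights` is a list of (w, b) tuples, one per output unit; a single
    ReLU layer stands in for the learned pairwise function (assumption).
    """
    out = [0.0] * len(weights)
    for ja in joints_a:
        for jb in joints_b:
            pair = list(ja) + list(jb)  # concatenate both joints' coordinates
            for k, (w, b) in enumerate(weights):
                pre = sum(wi * xi for wi, xi in zip(w, pair)) + b
                out[k] += max(0.0, pre)  # ReLU
    return out


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(person_features, query):
    """Weight each individual by a softmax over dot-product scores."""
    scores = [sum(q * f for q, f in zip(query, feat)) for feat in person_features]
    alphas = softmax(scores)
    dim = len(person_features[0])
    pooled = [sum(a * feat[k] for a, feat in zip(alphas, person_features))
              for k in range(dim)]
    return pooled, alphas


def group_descriptor(skeletons, ball, w_person, w_object, query):
    """Build one feature per individual, then merge them with attention."""
    feats = []
    for i, skel in enumerate(skeletons):
        inter = [0.0] * len(w_person)
        for j, other in enumerate(skeletons):
            if i != j:  # relations between this individual and every other
                rel = relation_module(skel, other, w_person)
                inter = [f + r for f, r in zip(inter, rel)]
        # Separate module for the person-object relation (ball as one "joint").
        obj = relation_module(skel, [ball], w_object)
        feats.append(inter + obj)  # concatenate the two relation types
    return attention_pool(feats, query)


def random_weights(rng, out_dim, in_dim=4):
    """Hypothetical stand-in for trained parameters."""
    return [([rng.uniform(-1, 1) for _ in range(in_dim)], 0.0)
            for _ in range(out_dim)]


# Toy input: 3 individuals with 5 joints each, 2D coordinates in [0, 1).
rng = random.Random(0)
skeletons = [[(rng.random(), rng.random()) for _ in range(5)] for _ in range(3)]
ball = (0.5, 0.5)
w_person = random_weights(rng, 4)
w_object = random_weights(rng, 2)
query = [rng.uniform(-1, 1) for _ in range(6)]

group_feature, alphas = group_descriptor(skeletons, ball, w_person, w_object, query)
```

Here `alphas` holds one attention weight per individual and `group_feature` is the merged descriptor that a classifier would consume. Keeping the person-person and person-object modules separate mirrors the idea of inferring each relationship type independently before merging.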