Graph convolutional networks (GCNs) have been the predominant method for skeleton-based human action recognition, including human-human interaction recognition. However, when dealing with interaction sequences, current GCN-based methods simply split the two-person skeleton into two discrete graphs and perform graph convolution on each separately, as is done for single-person action classification. Such operations ignore rich interactive information and hinder effective modeling of spatial inter-body relationships. To overcome this shortcoming, we introduce a novel unified two-person graph that represents both inter-body and intra-body correlations between joints. Experiments show accuracy improvements in recognizing both interactions and individual actions when the proposed two-person graph topology is used. In addition, we design several graph labeling strategies to supervise the model in learning discriminative spatial-temporal interactive features. Finally, we propose a two-person graph convolutional network (2P-GCN). Our model achieves state-of-the-art results on four benchmarks from three interaction datasets: SBU, and the interaction subsets of NTU-RGB+D and NTU-RGB+D 120.
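To make the core idea concrete, the snippet below is a minimal sketch (not the authors' released implementation): it assumes the 25-joint NTU-RGB+D skeleton layout, merges two single-person graphs into one unified 50-node adjacency matrix with inter-body edges, and applies a single symmetrically normalized graph-convolution step. The particular inter-body joint pairs and feature dimensions are illustrative choices, not the labeling strategies defined in the paper.

```python
# Minimal sketch of a unified two-person skeleton graph plus one GCN step.
# Assumptions: NTU-RGB+D 25-joint layout; inter-body edge choice is illustrative.
import numpy as np

NUM_JOINTS = 25  # joints per person in NTU-RGB+D

# Intra-body bone edges of one NTU-RGB+D skeleton, 0-indexed (child, parent).
INTRA_EDGES = [
    (0, 1), (1, 20), (2, 20), (3, 2), (4, 20), (5, 4), (6, 5), (7, 6),
    (8, 20), (9, 8), (10, 9), (11, 10), (12, 0), (13, 12), (14, 13),
    (15, 14), (16, 0), (17, 16), (18, 17), (19, 18), (21, 22), (22, 7),
    (23, 24), (24, 11),
]

def build_two_person_adjacency(inter_edges):
    """Unified 50x50 adjacency: two intra-body blocks plus inter-body links.

    `inter_edges` lists (joint_of_person_1, joint_of_person_2) pairs; which
    pairs to connect is exactly what a graph labeling strategy would decide,
    so the pairs passed in below are purely for illustration.
    """
    n = 2 * NUM_JOINTS
    A = np.zeros((n, n))
    for i, j in INTRA_EDGES:                     # person 1 block
        A[i, j] = A[j, i] = 1.0
    for i, j in INTRA_EDGES:                     # person 2 block, offset by 25
        A[i + NUM_JOINTS, j + NUM_JOINTS] = 1.0
        A[j + NUM_JOINTS, i + NUM_JOINTS] = 1.0
    for i, j in inter_edges:                     # inter-body edges
        A[i, j + NUM_JOINTS] = A[j + NUM_JOINTS, i] = 1.0
    return A

def gcn_layer(X, A, W):
    """One graph convolution: X' = D^{-1/2} (A + I) D^{-1/2} X W."""
    A_hat = A + np.eye(A.shape[0])               # add self-loops
    d_inv_sqrt = np.diag(A_hat.sum(axis=1) ** -0.5)
    return d_inv_sqrt @ A_hat @ d_inv_sqrt @ X @ W

# Example: link both hands of person 1 to both hands of person 2
# (0-indexed joints 7 and 11 are the two hands in this layout).
A = build_two_person_adjacency(inter_edges=[(7, 7), (7, 11), (11, 7), (11, 11)])
X = np.random.randn(2 * NUM_JOINTS, 3)           # 3D joint coordinates, both bodies
W = np.random.randn(3, 64)                       # learnable feature projection
out = gcn_layer(X, A, W)
print(out.shape)                                 # (50, 64)
```

Because the two bodies live in a single graph, one convolution can propagate features across inter-body edges, which is what separate per-person graphs cannot do.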