In skeleton-based action recognition, graph convolutional networks (GCNs) have developed rapidly, whereas convolutional neural networks (CNNs) have received less attention. One reason is that CNNs are considered poor at modeling the irregular skeleton topology. To alleviate this limitation, we propose a pure CNN architecture named Topology-aware CNN (Ta-CNN). In particular, we develop a novel cross-channel feature augmentation module, which is a combination of map-attend-group-map operations. By applying the module first at the coordinate level and then at the joint level, the topology feature is effectively enhanced. Notably, we theoretically prove that graph convolution is a special case of normal convolution when the joint dimension is treated as channels. This confirms that the topology modeling power of GCNs can also be realized with a CNN. Moreover, we design a SkeletonMix strategy, which mixes two persons in a unique manner and further boosts performance. Extensive experiments are conducted on four widely used datasets, namely N-UCLA, SBU, NTU RGB+D, and NTU RGB+D 120, to verify the effectiveness of Ta-CNN. We significantly surpass existing CNN-based methods. Compared with leading GCN-based methods, we achieve comparable performance with much lower complexity in terms of required GFLOPs and parameters.
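The claim that graph convolution is a special case of normal convolution can be illustrated numerically. The sketch below is not the proof or the implementation from the paper; it only checks, under simplifying assumptions, that the neighbor-aggregation step of a graph convolution (multiplying features by an adjacency matrix) matches a 1x1 convolution once the joint dimension is moved to the channel axis. The function names, tensor layout `(C, T, V)`, and the random adjacency matrix are all hypothetical choices made for illustration, and the learnable feature transform of a GCN layer is omitted.

```python
import numpy as np


def graph_aggregate(x, adj):
    """Neighbor aggregation of a graph convolution.

    x:   features of shape (C, T, V) = (channels, frames, joints)
    adj: adjacency matrix of shape (V, V)
    Returns y with y[c, t, w] = sum_v x[c, t, v] * adj[v, w].
    """
    return np.einsum("ctv,vw->ctw", x, adj)


def conv1x1(x, weight):
    """Plain pointwise (1x1) convolution without bias.

    x:      input of shape (C_in, H, W)
    weight: kernel of shape (C_out, C_in)
    Returns output of shape (C_out, H, W).
    """
    return np.einsum("oi,ihw->ohw", weight, x)


def graph_aggregate_as_conv(x, adj):
    """Same aggregation, expressed as a 1x1 convolution over joints.

    Assumption for this sketch: joints are moved to the channel axis,
    and the convolution kernel is fixed to the transposed adjacency.
    """
    x_joint_first = x.transpose(2, 1, 0)   # (V, T, C): joints as channels
    y = conv1x1(x_joint_first, adj.T)      # weight[w, v] = adj[v, w]
    return y.transpose(2, 1, 0)            # back to (C, T, V)


# Hypothetical sizes: 3 coordinate channels, 8 frames, 5 joints.
rng = np.random.default_rng(0)
x = rng.standard_normal((3, 8, 5))
adj = rng.random((5, 5))

same = np.allclose(graph_aggregate(x, adj), graph_aggregate_as_conv(x, adj))
print("aggregation matches 1x1 convolution:", same)
```

In this view, a general convolution over the joint-as-channel axis has a free `(V, V)` kernel, while the graph aggregation corresponds to the particular kernel given by the adjacency matrix, which is one way to read the "special case" statement.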