Graph convolutional networks (GCNs) are widely used in skeleton-based action recognition and have achieved remarkable results. In GCNs, graph topology dominates feature aggregation and is therefore key to extracting representative features. In this work, we propose a novel Channel-wise Topology Refinement Graph Convolution (CTR-GC) that dynamically learns different topologies and effectively aggregates joint features in different channels for skeleton-based action recognition. The proposed CTR-GC models channel-wise topologies by learning a shared topology as a generic prior for all channels and refining it with channel-specific correlations for each channel. Our refinement method introduces few extra parameters and significantly reduces the difficulty of modeling channel-wise topologies. Furthermore, by reformulating graph convolutions into a unified form, we find that CTR-GC relaxes the strict constraints of graph convolutions, leading to stronger representation capability. Combining CTR-GC with temporal modeling modules, we develop a powerful graph convolutional network named CTR-GCN, which notably outperforms state-of-the-art methods on the NTU RGB+D, NTU RGB+D 120, and NW-UCLA datasets.
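The idea of refining a shared topology per channel can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `refine_topology` and `aggregate`, the scaling factor `alpha`, and the choice of `tanh` over pairwise feature differences as the channel-specific correlation are all assumptions made for illustration, and the learned feature transforms and temporal modules are omitted.

```python
import math


def refine_topology(shared, features, alpha=0.5):
    """Build one N x N topology per channel (illustrative sketch only).

    shared:   N x N nested list, the topology shared by all channels.
    features: N x C nested list, one C-dimensional feature per joint.
    alpha:    hypothetical scalar weighting the channel-specific term.
    """
    n = len(shared)
    channels = len(features[0])
    refined = []
    for ch in range(channels):
        # Assumed correlation: tanh of the pairwise feature difference
        # between joints i and j in this channel.
        topo = [
            [
                shared[i][j]
                + alpha * math.tanh(features[i][ch] - features[j][ch])
                for j in range(n)
            ]
            for i in range(n)
        ]
        refined.append(topo)
    return refined


def aggregate(refined, features):
    """Aggregate joint features, using each channel's own topology."""
    n = len(features)
    channels = len(features[0])
    return [
        [
            sum(refined[ch][i][j] * features[j][ch] for j in range(n))
            for ch in range(channels)
        ]
        for i in range(n)
    ]
```

With `alpha=0` every channel falls back to the shared topology, which reduces the sketch to an ordinary graph convolution with a single adjacency matrix; a nonzero `alpha` lets each channel aggregate over a different graph.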