Skeleton-based action recognition has attracted considerable attention because the human skeleton provides a compact representation of the body. Many recent methods have achieved remarkable performance using graph convolutional networks (GCNs) and convolutional neural networks (CNNs), which extract spatial and temporal features, respectively. Although spatial and temporal dependencies in the human skeleton have each been explored, the joint spatio-temporal dependency is rarely considered. In this paper, we propose the Inter-Frame Curve Network (IFC-Net) to effectively leverage the spatio-temporal dependency of the human skeleton. Our proposed network consists of two novel elements: 1) the Inter-Frame Curve (IFC) module; and 2) Dilated Graph Convolution (D-GC). The IFC module enlarges the spatio-temporal receptive field by identifying meaningful node connections between every pair of adjacent frames and generating spatio-temporal curves from the identified connections. The D-GC, which operates specifically in the spatial domain, gives the network a large spatial receptive field. The kernels of D-GC are computed from the given adjacency matrices of the graph and provide a large receptive field in a way similar to dilated CNNs. Our IFC-Net combines these two modules and achieves state-of-the-art performance on three skeleton-based action recognition benchmarks: NTU-RGB+D 60, NTU-RGB+D 120, and Northwestern-UCLA.
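The abstract does not specify how the D-GC kernels are derived from the adjacency matrices. As a rough illustration only, the sketch below assumes one plausible reading: a kernel with dilation d connects each joint to the joints exactly d hops away on the skeleton graph, much as a dilated CNN kernel skips intermediate positions. The function names `hop_distance` and `dilated_adjacency` and the row normalization are hypothetical choices made for this example, not the authors' implementation.

```python
import numpy as np


def hop_distance(adj: np.ndarray) -> np.ndarray:
    """Shortest-path hop count between every pair of nodes (inf if unreachable)."""
    n = adj.shape[0]
    a = (adj > 0).astype(np.int64)
    dist = np.full((n, n), np.inf)
    np.fill_diagonal(dist, 0)
    reached = np.eye(n, dtype=bool)       # pairs already assigned a distance
    walk = np.eye(n, dtype=np.int64)      # walk[i, j] = 1 if a walk of the current length exists
    for d in range(1, n):
        walk = (walk @ a > 0).astype(np.int64)
        new = (walk > 0) & ~reached       # first time the pair becomes reachable
        if not new.any():
            break
        dist[new] = d
        reached |= new
    return dist


def dilated_adjacency(adj: np.ndarray, dilation: int) -> np.ndarray:
    """Assumed D-GC-style kernel: row-normalized links to nodes exactly `dilation` hops away."""
    mask = (hop_distance(adj) == dilation).astype(np.float64)
    deg = mask.sum(axis=1, keepdims=True)
    return np.divide(mask, deg, out=np.zeros_like(mask), where=deg > 0)


if __name__ == "__main__":
    # Toy 5-joint chain 0-1-2-3-4 standing in for a skeleton graph.
    chain = np.zeros((5, 5))
    for i in range(4):
        chain[i, i + 1] = chain[i + 1, i] = 1.0

    a2 = dilated_adjacency(chain, dilation=2)
    features = np.random.default_rng(0).normal(size=(5, 3))  # (joints, channels)
    aggregated = a2 @ features  # each joint averages its 2-hop neighbours
    print(a2)
    print(aggregated.shape)
```

Under this assumption, increasing the dilation widens the spatial receptive field without adding parameters per kernel, which matches the analogy to dilated CNNs drawn in the abstract; a learned weight matrix would normally be applied to the aggregated features as in a standard graph convolution.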