Graph convolutional networks (GCNs) have been very successful in modeling non-Euclidean data structures, such as sequences of body skeletons forming actions modeled as spatio-temporal graphs. Most GCN-based action recognition methods use deep feed-forward networks with high computational complexity to process all skeletons in an action. This leads to a high number of floating-point operations (ranging from 16G to 100G FLOPs) to process a single sample, making their adoption in computation-restricted application scenarios infeasible. In this paper, we propose a temporal attention module (TAM) that increases the efficiency of skeleton-based action recognition by selecting the most informative skeletons of an action at the early layers of the network. We incorporate the TAM into a lightweight GCN topology to further reduce the overall number of computations. Experimental results on two benchmark datasets show that the proposed method outperforms the baseline GCN-based method by a large margin while requiring 2.9 times fewer computations. Moreover, it performs on par with the state of the art while requiring up to 9.6 times fewer computations.
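The core idea of the TAM described above can be illustrated with a minimal sketch: score each skeleton (frame) in the sequence, normalize the scores with a softmax, and keep only the top-k most informative frames so that later layers process a shorter sequence. This is a simplified, hypothetical illustration of frame selection using a fixed scoring vector; the paper's actual TAM is a learned module inside the GCN, and the function and parameter names here are placeholders, not the authors' API.

```python
import math

def temporal_attention_select(frames, score_weights, k):
    """Select the k most informative frames from a sequence.

    frames        -- list of per-frame feature vectors (lists of floats)
    score_weights -- scoring vector (stand-in for a learned attention layer)
    k             -- number of frames to keep for the later, heavier layers

    Returns (kept_indices, attention) where kept_indices are in temporal
    order and attention is the softmax-normalized importance per frame.
    """
    # Score each frame with a dot product against the scoring vector.
    scores = [sum(w * x for w, x in zip(score_weights, f)) for f in frames]
    # Numerically stable softmax over the temporal axis.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    attention = [e / total for e in exps]
    # Keep the k highest-attention frames, restored to temporal order.
    top = sorted(sorted(range(len(frames)), key=lambda i: -attention[i])[:k])
    return top, attention

# Example: 4 frames with 2-dim features; the scoring vector weights the
# first feature dimension, so frames 2 and 0 score highest.
kept, attn = temporal_attention_select(
    [[1.0, 0.0], [0.0, 1.0], [2.0, 0.0], [0.0, 3.0]],
    score_weights=[1.0, 0.0],
    k=2,
)
```

Pruning frames this early is where the computational savings come from: every subsequent graph convolution operates on k skeletons instead of the full sequence, which is consistent with the FLOP reductions reported in the abstract.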