Action recognition based on skeleton data has recently witnessed increasing attention and progress. State-of-the-art approaches adopting Graph Convolutional Networks (GCNs) can effectively extract features from human skeletons by relying on a pre-defined human topology. Despite this progress, GCN-based methods have difficulty generalizing across domains, especially across domains with different human topological structures. In this context, we introduce UNIK, a novel skeleton-based action recognition method that is not only effective at learning spatio-temporal features on human skeleton sequences but also able to generalize across datasets. This is achieved by learning an optimal dependency matrix, initialized from the uniform distribution, based on a multi-head attention mechanism. Subsequently, to study the cross-domain generalizability of skeleton-based action recognition in real-world videos, we re-evaluate state-of-the-art approaches as well as the proposed UNIK on Posetics, a novel dataset created from Kinetics-400 videos by estimating, refining and filtering poses. We provide an analysis of how much performance improves on smaller benchmark datasets after pre-training on Posetics for the action classification task. Experimental results show that the proposed UNIK, with pre-training on Posetics, generalizes well and outperforms the state of the art when transferred to four target action classification datasets: Toyota Smarthome, Penn Action, NTU-RGB+D 60 and NTU-RGB+D 120.
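The idea of a dependency matrix that starts from the uniform distribution instead of a fixed skeleton graph can be illustrated with a minimal sketch. This is only one plausible reading of the abstract, not the UNIK implementation: the function names, the bound `1 / sqrt(num_joints)`, and the choice to sum over heads are all assumptions made for illustration, and the training step that would actually learn the matrix entries is omitted.

```python
import random


def init_dependency_heads(num_heads, num_joints, seed=0):
    """Create one num_joints x num_joints matrix per head.

    Entries are drawn from U(-bound, bound). The bound is an assumed
    value for illustration; no adjacency (bone) information is used.
    """
    rng = random.Random(seed)
    bound = 1.0 / num_joints ** 0.5
    return [
        [[rng.uniform(-bound, bound) for _ in range(num_joints)]
         for _ in range(num_joints)]
        for _ in range(num_heads)
    ]


def aggregate_joints(heads, features):
    """Mix per-joint features of a single frame with every head.

    features: list of num_joints feature vectors (one per joint).
    Returns a list of the same shape, where joint i receives a weighted
    sum of all joints j, accumulated over all heads (assumed design).
    """
    num_joints = len(features)
    dim = len(features[0])
    out = [[0.0] * dim for _ in range(num_joints)]
    for dep in heads:
        for i in range(num_joints):
            for j in range(num_joints):
                weight = dep[i][j]
                for c in range(dim):
                    out[i][c] += weight * features[j][c]
    return out


# Hypothetical usage: a 17-joint 2D skeleton and a 25-joint 3D skeleton
# go through the same code, since no joint connectivity is hard-coded.
heads_17 = init_dependency_heads(num_heads=3, num_joints=17)
frame_17 = [[0.1 * j, 0.2 * j] for j in range(17)]
mixed_17 = aggregate_joints(heads_17, frame_17)

heads_25 = init_dependency_heads(num_heads=3, num_joints=25)
frame_25 = [[0.1 * j, 0.2 * j, 0.3 * j] for j in range(25)]
mixed_25 = aggregate_joints(heads_25, frame_25)
```

Because every joint may depend on every other joint from the start, nothing in the sketch ties the model to one dataset's joint layout, which is the property the abstract associates with cross-dataset generalization. In a real model the matrix entries would be trained by gradient descent together with the rest of the network.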