Deep learning architectures, although successful in most computer vision tasks, were designed for data with an underlying Euclidean structure. This assumption often fails, because pre-processed data may lie on a non-linear space. In this paper, we propose a geometry-aware deep learning approach for skeleton-based action recognition. Skeleton sequences are first modeled as trajectories on Kendall's shape space and then mapped to the linear tangent space. The resulting structured data are then fed to a deep learning architecture that includes a layer optimizing over rigid and non-rigid transformations of the 3D skeletons, followed by a CNN-LSTM network. Experiments on two large-scale skeleton datasets, NTU-RGB+D and NTU-RGB+D 120, show that the proposed approach outperforms existing geometric deep learning methods and is competitive with recently published approaches.
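To make the first stage concrete, the following is a minimal NumPy sketch of mapping a skeleton sequence to a tangent space of Kendall's shape space. It is an illustration under stated assumptions, not the pipeline used in the paper: the function names are hypothetical, the first frame is arbitrarily taken as the reference shape, rotation is removed by standard Procrustes alignment, and the 25-joint layout only mimics NTU-style skeletons with random data. The trainable transformation layer and the CNN-LSTM network are not shown.

```python
import numpy as np


def to_preshape(skeleton):
    """Remove translation and scale: center the joints, then normalize
    to unit Frobenius norm (a point on the pre-shape sphere)."""
    centered = skeleton - skeleton.mean(axis=0, keepdims=True)
    return centered / np.linalg.norm(centered)


def align_rotation(x, ref):
    """Rotate pre-shape x onto ref (orthogonal Procrustes restricted to
    proper rotations, i.e. determinant +1)."""
    u, _, vt = np.linalg.svd(x.T @ ref)
    # Flip the last singular direction if the optimum would be a reflection.
    u[:, -1] *= np.sign(np.linalg.det(u @ vt))
    return x @ (u @ vt)


def log_map(ref, x):
    """Log map on the unit pre-shape sphere: tangent vector at ref that
    points toward x, with length equal to the geodesic distance.
    Antipodal points (angle close to pi) are not handled in this sketch."""
    cos_theta = np.clip(np.sum(ref * x), -1.0, 1.0)
    theta = np.arccos(cos_theta)
    if theta < 1e-10:
        return np.zeros_like(ref)
    return (theta / np.sin(theta)) * (x - cos_theta * ref)


def sequence_to_tangent(sequence, ref_index=0):
    """Map a (frames, joints, 3) sequence to tangent vectors at the
    reference frame's shape (assumption: reference = one chosen frame)."""
    preshapes = [to_preshape(frame) for frame in sequence]
    ref = preshapes[ref_index]
    return np.stack([log_map(ref, align_rotation(p, ref)) for p in preshapes])


# Hypothetical input: 4 frames of 25 random 3D joints.
rng = np.random.default_rng(0)
demo_sequence = rng.normal(size=(4, 25, 3))
demo_tangent = sequence_to_tangent(demo_sequence)
```

Each output frame is a matrix of the same size as the input skeleton, but it now lives in a linear space, so it can be passed to standard layers. Because translation, scale, and rotation are factored out before the log map, a rigidly moved copy of a frame maps to the same tangent vector.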