Existing action recognition methods mainly focus on joint and bone information in human body skeleton data due to its robustness to complex backgrounds and dynamic environments. In this paper, we combine body skeleton data with spatial and motion features from the face and two hands, and present "Deep Action Stamps (DeepActs)", a novel data representation that encodes actions from video sequences. We also present "DeepActsNet", a deep-learning-based ensemble model that learns convolutional and structural features from Deep Action Stamps for highly accurate action recognition. Experiments on three challenging action recognition datasets (NTU60, NTU120, and SYSU) show that the proposed model trained using Deep Action Stamps produces considerable improvements in action recognition accuracy at lower computational cost compared to state-of-the-art methods.