Human action recognition is an active research area in computer vision. Although great progress has been made, previous methods mostly recognize actions from depth data at only a single scale, and thus often neglect the multi-scale features that provide additional information for action recognition in practical application scenarios. In this paper, we present a novel framework that focuses on multi-scale motion information to recognize human actions from depth video sequences. We propose a multi-scale feature map called Laplacian pyramid depth motion images (LP-DMI). We employ depth motion images (DMI) as templates to generate a multi-scale static representation of actions. Then, we calculate the LP-DMI to enhance the multi-scale dynamic information of motions and reduce redundant static information of human bodies. We further extract a multi-granularity descriptor called LP-DMI-HOG to provide more discriminative features. Finally, we utilize an extreme learning machine (ELM) for action classification. The proposed method yields recognition accuracies of 93.41%, 85.12%, and 91.94% on the public MSRAction3D, UTD-MHAD, and DHA datasets, respectively. Through extensive experiments, we show that our method outperforms state-of-the-art benchmarks.
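The core idea above can be illustrated with a minimal NumPy sketch: a DMI is built by accumulating absolute frame-to-frame depth differences, and a Laplacian pyramid then separates its motion energy into multiple scales. This is only an illustrative approximation under simplifying assumptions (2x2 mean downsampling and nearest-neighbor upsampling stand in for the Gaussian filtering a standard Laplacian pyramid uses); it is not the paper's exact implementation.

```python
import numpy as np

def depth_motion_image(frames):
    """Accumulate absolute depth differences between consecutive frames
    into a single motion template (DMI), normalized to [0, 1]."""
    frames = np.asarray(frames, dtype=np.float64)
    dmi = np.abs(np.diff(frames, axis=0)).sum(axis=0)
    peak = dmi.max()
    return dmi / peak if peak > 0 else dmi

def laplacian_pyramid(img, levels=3):
    """Build a simplified Laplacian pyramid: each level holds the detail
    lost when moving to the next coarser scale; the last level is the
    coarsest residual. Uses 2x2 mean pooling instead of Gaussian blur."""
    gaussian = [img]
    for _ in range(levels - 1):
        g = gaussian[-1]
        h, w = g.shape[0] // 2 * 2, g.shape[1] // 2 * 2
        # 2x2 mean pooling as a crude smooth-and-downsample step
        down = g[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
        gaussian.append(down)
    pyramid = []
    for i in range(levels - 1):
        # nearest-neighbor upsample of the coarser level
        up = np.repeat(np.repeat(gaussian[i + 1], 2, axis=0), 2, axis=1)
        up = up[:gaussian[i].shape[0], :gaussian[i].shape[1]]
        pyramid.append(gaussian[i] - up)  # band-pass detail at this scale
    pyramid.append(gaussian[-1])          # low-frequency residual
    return pyramid
```

In the full method, a HOG descriptor would be extracted from every pyramid level (yielding the multi-granularity LP-DMI-HOG feature) before classification with an ELM.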