Human action recognition is an important application domain in computer vision. Its primary aim is to accurately describe human actions and their interactions from a previously unseen data sequence acquired by sensors. The ability to recognize, understand, and predict complex human actions enables the construction of many important applications such as intelligent surveillance systems, human-computer interfaces, health care, security, and military applications. In recent years, deep learning has been given particular attention by the computer vision community. This paper presents an overview of the current state-of-the-art in action recognition using video analysis with deep learning techniques. We present the most important deep learning models for recognizing human actions, and analyze them to provide the current progress of deep learning algorithms applied to solve human action recognition problems in realistic videos highlighting their advantages and disadvantages. Based on the quantitative analysis using recognition accuracies reported in the literature, our study identifies state-of-the-art deep architectures in action recognition and then provides current trends and open problems for future works in this field.
翻译:人类行动认知是计算机愿景中的一个重要应用领域,其主要目的是准确描述人类行动及其从传感器获得的先前不为人知的数据序列中产生的相互作用。认识、理解和预测复杂的人类行动的能力使得能够构建许多重要的应用,例如智能监测系统、人-计算机界面、保健、安全和军事应用。近年来,计算机视觉界特别重视深层次的学习。本文件利用深层学习技术的视频分析,概述了当前最先进的行动认知状态。我们展示了承认人类行动的最重要深层学习模式,并分析了这些模式,以现实的视频提供用于解决人类行动认知问题的深层学习算法目前的进展,突出其优劣之处。根据文献中报告的认知精度的定量分析,我们的研究查明了行动认知中最先进的深层结构,然后为该领域的未来工作提供了当前的趋势和开放的问题。