Recognizing and categorizing human actions is an important task with applications in various fields such as human-robot interaction, video analysis, surveillance, video retrieval, health care system and entertainment industry. This thesis presents a novel computational approach for human action recognition through different implementations of multi-layer architectures based on artificial neural networks. Each system level development is designed to solve different aspects of the action recognition problem including online real-time processing, action segmentation and the involvement of objects. The analysis of the experimental results are illustrated and described in six articles. The proposed action recognition architecture of this thesis is composed of several processing layers including a preprocessing layer, an ordered vector representation layer and three layers of neural networks. It utilizes self-organizing neural networks such as Kohonen feature maps and growing grids as the main neural network layers. Thus the architecture presents a biological plausible approach with certain features such as topographic organization of the neurons, lateral interactions, semi-supervised learning and the ability to represent high dimensional input space in lower dimensional maps. For each level of development the system is trained with the input data consisting of consecutive 3D body postures and tested with generalized input data that the system has never met before. The experimental results of different system level developments show that the system performs well with quite high accuracy for recognizing human actions.
翻译:对人类行动进行认知和分类是一项重要任务,涉及在人类机器人互动、视频分析、监视、视频检索、保健系统和娱乐业等各个领域的应用。这一论文为人类行动识别提供了一个新型的计算方法,通过以人工神经网络为基础的多层结构的不同实施,对人的行为进行了新的计算方法。每个系统层面的开发都旨在解决行动识别问题的不同方面,包括在线实时处理、行动分解和物体的参与。实验结果的分析在六篇文章中作了说明和描述。该论文的拟议行动识别结构由几个处理层组成,包括一个预处理层、一个定序矢量代表层和三个神经网络层。它利用了像Kokoonen地貌图这样的自我组织神经网络网络,以及作为主要神经网络层的网络不断增长。因此,该结构呈现了一种具有某些特征的生物合理方法,例如神经系统的地形组织、横向互动、半封闭式学习和在较低维度的地图中代表高维度输入空间的能力。该系统的每个开发阶段都经过由输入数据组成的培训,其中包括连续3D体特征图层图层图层图和三层神经图层图层图的不断的精确性研究,并测试了人类输入系统,这些系统从不曾通过不同的输入系统来显示高输入结果。