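Project title: Deep-learning-based semantic understanding of actions in temporal 3D depth-image sequences
Project number: No.61301299
Project type: Young Scientists Fund project
Year of approval: 2014
Discipline: Radio electronics; telecommunications technology
Project author: Ji Yi (季怡)
Author affiliation: Soochow University (苏州大学)
Funding amount: 240,000 yuan (RMB)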
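Abstract (translated from Chinese): The human visual system takes the color, shape, and depth information captured by the eyes and, through analysis in the brain, derives the abstract semantics of objects and actions. Simulating this process with machine learning is important for intelligent surveillance, human-computer interaction, video retrieval, and related applications. To this end, this project proposes combining depth images with conventional video data and using deep learning to simulate the brain's multi-layer neuronal transmission process, in order to achieve dynamic semantic understanding of human actions as they continuously change. The research content and innovations are: 1) using a deep belief network to realize multi-layer unsupervised learning of human pose, from low-level features to abstract cognition; 2) combining conventional color video data with stereo depth data to form a multi-source competitive network that simulates visual perception in the cerebral cortex; 3) applying a multi-layer self-taught network to the input stream over the time series to simulate the nervous system's stepwise cognitive process of acquiring, segmenting, abstracting, recognizing, and understanding human behavior. This system, built on the processes of perception, recognition, and memory, can not only provide an efficient learning mechanism and recognition capability for machine vision, but can also be further extended to incorporate other channels such as hearing and touch.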
Keywords (translated from Chinese): image-based behavior analysis; depth images; deep learning; machine vision;
English abstract: Based on color, shape, and depth information from the two eyes, the human visual system obtains an abstract understanding of objects and their activities through analysis in the brain. Machine learning can imitate this process, which plays an important role in intelligent surveillance, human-machine interaction, and video analysis. This project proposes to combine depth images with traditional video data and to use deep learning to imitate the multi-layer neural network of the human brain in order to understand human behavior in a long hybrid sequence. The research topics and novelties are: 1) using a deep belief network to realize the process of unsupervised learning; 2) combining traditional color video data and 3D depth images in a competitive network to imitate the visual perception of the human brain; 3) for input sequences of hybrid media, using a multi-layer self-taught network to fuse the inputs, detect segment boundaries, abstract concepts, and perform recognition. Based on this process of perception, recognition, and memory, the system can not only improve learning ability and recognition performance in computer vision, but can also be extended to broader areas such as touch or hearing.
English keywords: Activity Analysis;Deep Learning;Depth Images;Computer Vision;