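Project title: Research on Visual Attention Based on Contextual Cues in a Deep Learning Framework
Project number: 61502311
Project type: Young Scientists Fund
Year of approval: 2016
Discipline: Computer Science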
Project author: Zhong Shenghua (钟圣华)
Affiliation: Shenzhen University (深圳大学)
Funding amount: 200,000 yuan
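Abstract (translated from Chinese): Visual attention is the gateway to the visual system, and modeling the visual attention system is an important and challenging interdisciplinary problem with significant implications for computer science, psychology, cognitive neuroscience, and other related fields. To achieve accurate, efficient, and complete modeling of visual attention, the project team takes eye-movement scanpath prediction as its entry point, proposes a comprehensive task framework for visual attention computation, adds contextual cues to the system of influencing factors, and proposes a visual attention model based on contextual cues within a deep learning framework. The model tightly integrates deep learning with contextual cues: deep learning yields representations of the contextual cues at multiple semantic levels, and the contextual cues in turn act as a regularization factor that guides the deep learning, ultimately establishing a visual attention computation method that covers multiple semantic levels. The project will enrich and develop the theoretical system of visual attention computation and improve computers' ability to extract, represent, understand, and discriminate features. By exploring the patterns and causes of the attention mechanism of the visual system, the project will provide algorithmic support for basic theoretical research in brain science, especially vision science, and has important application value in multimedia content analysis, virtual reality, robotics, pathology detection, finance, advertising, and other fields.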
Keywords (translated from Chinese): Visual attention modeling; Contextual cues; Deep learning
English abstract: Visual attention serves as a gatekeeper of the human visual system. Building computational models of the visual attention system is a challenging but important interdisciplinary problem spanning computer science, psychology, cognitive neuroscience, and many other related areas. To capture the properties of the human visual attention system accurately and efficiently, our research group starts from the prediction of human eye-movement scanpaths and proposes a complete framework for computational models of visual attention. More importantly, we incorporate contextual cues into the pool of influencing factors and build a model that combines deep learning algorithms with contextual cue information. This model uses deep learning to generate a multi-layer semantic representation of the contextual cues, which in turn serves as a regularization factor that guides the deep learning process. In the long run, the model is intended to yield a computational visual attention algorithm that includes multi-layer semantic information. This project will contribute to the theoretical framework of computational models of visual attention and improve computers' ability to extract, represent, understand, and recognize important visual features. By exploring the patterns and mechanisms of the human visual system, this project will support brain science, especially visual cognitive science, at the algorithmic level. It is expected to have significant impact on applications in multimedia content analysis, virtual reality, robotics, pathology, finance, and advertising.
English keywords: Visual attention modeling; Context cueing; Deep learning