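Project title: Research on Image Scene Semantic Classification Imitating Visual Perception Mechanisms
Project number: No.61303128
Project type: Young Scientists Fund project
Year of approval: 2014
Discipline: Automation technology; computer technology
Project author: Gu Guanghua (顾广华)
Author affiliation: Yanshan University (燕山大学)
Funding: 230,000 yuan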
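Chinese abstract (translated): Given the vast volume of available image data, enabling computers to mimic the human visual perception system in order to extract scene semantics from images, and thereby to classify, manage, and organize those images effectively, has become a pressing problem. Because scene images exhibit large intra-class object variation and inter-class visual similarity, image scene semantic classification is very difficult. To address these problems, this project focuses on the following: (1) constructing multi-scale contextual features and clustering them to generate a visual dictionary, treating each visual word as a channel, mapping features to obtain the feature projection distribution of each channel, and selecting effective features on the basis of information entropy theory; (2) dividing the image, at successive resolutions, into progressively finer regular superpixel lattices, fusing the relative positional relationships of the sub-region features to build a spatially hierarchical semantic description of the image, and designing a classifier to complete the outdoor scene classification task; (3) using global and local image attribute information together with image prototypes to build a prototype star-constellation model, and training the mapping between images and scene class labels to complete the indoor scene classification task. The research is expected to have significant theoretical value and broad application prospects for image semantic classification, reverse image search, and related tasks.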
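Chinese keywords (translated): Visual perception; Scene semantic classification; Image description; Semantic visual vocabulary; Superpixel lattice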
English abstract: Enabling computers to obtain the scene semantics of images by mimicking the human visual perception system, and thus to classify and organize the vast body of available image data effectively, has become an urgent problem. However, because scene images exhibit large intra-class object variation and inter-class visual similarity, image scene semantic classification is very difficult. To address these problems, this project focuses on the following: (1) We build multi-scale contextual features and cluster them to generate a visual dictionary, in which each visual word is regarded as a channel. Feature mapping then yields the feature projection distribution of each channel, and effective features are selected on the basis of information entropy theory; (2) At successive resolutions, the image is divided into progressively refined regular superpixel lattices. By fusing the relative positional relationships of the sub-region features, we build a spatially hierarchical semantic representation of the image and design a classifier to complete the outdoor scene classification task; (3) Using the global and local properties of the image, we can build the star-constellation m
English keywords: Visual perception; Scene semantic classification; Image representation; Semantic visual vocabulary; Superpixel lattices
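
Illustrative sketch for item (1) of the abstract: the record does not say how the entropy-based feature selection is carried out, so the following is only one plausible reading, not the project's method. It assumes that each visual word (channel) has a histogram of how often it occurs in each scene class, and that words whose occurrences concentrate in few classes (low Shannon entropy) count as the effective ones. The function names, the `keep` parameter, and the toy counts are all hypothetical.

```python
import math


def shannon_entropy(counts):
    """Shannon entropy (in bits) of a histogram of non-negative counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    entropy = 0.0
    for c in counts:
        if c > 0:
            p = c / total
            entropy -= p * math.log2(p)
    return entropy


def select_words(word_class_counts, keep):
    """Return the indices of the `keep` visual words with the lowest entropy.

    word_class_counts[w][k] is how often visual word w occurs in images of
    scene class k. Words that never occur are skipped, because an empty
    histogram carries no information about any class.
    Assumption: low entropy (class-specific word) means "effective".
    """
    scored = [
        (shannon_entropy(counts), w)
        for w, counts in enumerate(word_class_counts)
        if sum(counts) > 0
    ]
    scored.sort()  # lowest entropy first; ties broken by word index
    return sorted(w for _, w in scored[:keep])


# Hypothetical toy data: 4 visual words, 3 scene classes.
counts = [
    [30, 0, 0],    # word 0 occurs in one class only
    [10, 10, 10],  # word 1 is spread evenly over all classes
    [0, 20, 20],   # word 2 is shared by two classes
    [5, 5, 20],    # word 3 leans towards one class
]
selected = select_words(counts, keep=2)
print(selected)
```

On this toy input the two retained words are the single-class word and the two-class word; the evenly spread word is dropped because it says nothing about the scene class. In a real pipeline the counts would come from assigning local descriptors to the clustered visual dictionary, which this sketch leaves out.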