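Project title: Research on Text Detection in Complex Natural Scenes Based on Visual Context and Text Saliency
Project number: No.61502164
Project type: Young Scientists Fund
Year of approval: 2016
Discipline: Other
Project author: Wang Runmin (王润民)
Affiliation: Hunan Normal University (湖南师范大学)
Funding: 200,000 CNY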
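Chinese abstract (translated): Natural scene text detection is a key enabling technology for applications such as image retrieval, intelligent transportation, and mobile guidance for the visually impaired. The diversity of fonts, sizes, colors, and layouts of text in natural scenes, together with lighting changes, complex backgrounds, and noise, makes text detection highly challenging. Current techniques mainly rely on handcrafted features to separate text regions from background regions, and they identify each text candidate in isolation while ignoring the contextual information of neighboring text, which degrades detection performance. This project studies natural scene text detection based on visual context and text saliency. The main work includes: (1) combining graph cut with the Maximally Stable Extremal Regions (MSER) method to improve the extraction of text connected components; (2) fusing handcrafted features with features obtained by deep learning to study text features with high classification performance, along with text recognition techniques; (3) combining low-level perceptual image content, high-level text information, and visual context information to design a natural scene text saliency model; (4) building an efficient and feasible natural scene text detection system based on our previous research and the methods above. The results of this project have theoretical significance and broad practical value for text recognition, pattern classification, and machine learning.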
Chinese keywords (translated): natural scene image; text detection; visual context; saliency detection; descriptive mid-level patch
English abstract: Natural scene text detection is an important technology for image retrieval, intelligent transportation, and mobile guidance for the visually impaired. However, detecting text in natural scene images is challenging because of variation in font, size, color, and arrangement, as well as lighting changes, complex backgrounds, and noise. Currently, most researchers characterize text candidate regions with handcrafted features and classify each candidate separately, ignoring the context shared by adjacent text candidates, which degrades detection performance. In contrast, this project combines visual context and text saliency to develop a natural scene text detection solution. First, the natural scene image is binarized using graph cut and Maximally Stable Extremal Regions (MSER) to obtain high-quality text connected components. Second, handcrafted features are fused with features learned by deep learning to study text features with high classification performance, along with text recognition techniques. Third, low-level perceptual image content, high-level text information, and visual context information are combined to design a natural scene text saliency model. Fourth, building on our previous research and the methods above, we plan to construct an efficient and feasible system for text detection in natural scene images. The results of this project are expected to have theoretical significance and broad practical value in character recognition, pattern classification, and machine learning.
English keywords: natural scene image; text detection; visual context; saliency detection; descriptive mid-level patch