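Project title: Research on Graph-Model-Based Extraction and Recognition of Scene Text and Overlaid Text
Project number: No.61271434
Project type: General Program
Year of approval: 2013
Discipline: Radio electronics; telecommunications technology
Project author: Wang Weiqiang (王伟强)
Affiliation: University of Chinese Academy of Sciences (中国科学院大学)
Funding: 760,000 yuan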
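Chinese abstract: Accurately extracting and recognizing text rendered as pixels in images and videos is of significant research value and has broad application prospects. This project will systematically study the key techniques involved, including the localization, segmentation, and rectification of scene text; the localization and segmentation of overlaid text against complex backgrounds; and character recognition under imperfect segmentation, with an emphasis on extending and innovating the general theory. The specific research problems include: a graph-model-based algorithm for detecting general scene text; prior-knowledge-guided methods for detecting scene text at low resolution and under complex lighting; an efficient segmentation technique that integrates edge detection and region segmentation; view rectification methods based on multiple cues for scene text with perspective distortion; a unified method that extracts both overlaid text and scene text; and a recognition model, built on redundant multi-way trees and graph models, that handles noisy input and a large number of classes. The research is closely tied to practical applications, and its potential results are also of significant value for enriching fundamental theory in object detection, object segmentation, and machine learning.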
Chinese keywords: overlaid text; scene text; text recognition; saliency; deep neural networks
English abstract: Accurately extracting and recognizing scene text and overlaid text in images and videos is of great research significance and has broad applications. In this project, we will systematically conduct research on the various key techniques involved, including the localization, segmentation, and rectification of scene text, the localization and segmentation of overlaid text, and the recognition of imperfectly segmented characters, with an emphasis on theoretical innovation. The concrete research topics include: (1) an approach to detecting scene text based on graph models; (2) techniques for detecting low-resolution scene text under complex lighting conditions when some prior knowledge is available; (3) an efficient technique that integrates edge detection and region segmentation; (4) rectification techniques for scene text distorted by the projective transform of the camera; (5) a unified approach to extracting both scene text and overlaid text; (6) an approach to constructing a recognition system with noisy input and a large number of class labels as output, using redundant n-ary trees and graph models. The research topics of the project are closely related to practical applications, and at the same time the potential research results are very valuable for enriching the fundamental theory of object detection, object segmentation, and machine learning.
English keywords: overlaid text; scene text; text recognition; saliency; deep neural networks