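Project title: Natural Scene Text Recognition Combining Feedforward and Feedback Mechanisms
Project number: No. 61473036
Project type: General Program
Year of approval: 2015
Discipline: Other
Principal investigator: 殷绪成 (Yin Xucheng)
Affiliation: University of Science and Technology Beijing (北京科技大学)
Funding: CNY 830,000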
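Chinese abstract (translated): End-to-end scene text recognition is an important enabling technology for applications such as human-computer interaction, image understanding, and video retrieval. Current approaches mostly study text detection, segmentation, and recognition as separate stages, and their overall recognition performance is very limited. This project investigates end-to-end scene text recognition that combines feedforward and feedback mechanisms. First, we propose a deep neural network architecture with feature fusion and integration, and study efficient scene character classifiers and word recognition techniques. Second, we propose a feedback technique based on joint learning from image features and recognition outputs, and study effective information feedback methods for scene text recognition. Third, we introduce the feedforward pattern with a positive feedback loop from network motifs, and propose a new overall mechanism of information feedforward and feedback for end-to-end scene text recognition. Finally, building on our world-leading natural scene text detection and segmentation techniques and combining the methods above, we construct end-to-end scene text recognition technology at a world-leading level. The results of this project have considerable theoretical significance and important practical value for character recognition, pattern recognition, machine learning, and image retrieval.
Chinese keywords (translated): text recognition; text detection; feedforward; feedback; natural scene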
English abstract: End-to-end scene text recognition has important applications in human-computer interaction, image understanding, and video retrieval. Currently, most researchers investigate text detection, segmentation, and recognition separately within the end-to-end system, and the overall recognition performance is very limited. In contrast, our project focuses on an end-to-end scene text recognition system that combines feedforward and feedback mechanisms. First, we propose a deep neural network framework with feature fusion, and construct efficient character and word classifiers. Second, we propose a feedback learning algorithm based on visual features and classifier outputs, and investigate adaptive feedback strategies for scene text recognition. Third, based on the concept of the feed-forward loop in network motifs, we propose a new, unified strategy that integrates feedforward and feedback in the end-to-end recognition system. Finally, building on our leading text detection technology and combining it with the above methods, we construct a world-class system for end-to-end text recognition in natural scene images. The achievements of this project will include several important novel theories and technologies in character recognition, pattern recognition, machine learning, and image retrieval.
English keywords: text recognition; text detection; feedforward; feedback; natural scene