Unknown-box Approximation to Improve Optical Character Recognition Performance

Optical character recognition (OCR) is a widely used pattern recognition application in numerous domains. Several feature-rich, general-purpose OCR solutions are available to consumers, and they provide moderate to excellent accuracy. However, accuracy can diminish on difficult and uncommon document domains. Preprocessing of document images can be used to minimize the effect of domain shift. In this paper, a novel approach is presented for creating a customized preprocessor for a given OCR engine. Unlike previous OCR-agnostic preprocessing techniques, the proposed approach approximates the gradient of a particular OCR engine in order to train a preprocessor module. Experiments with two datasets and two OCR engines show that the presented preprocessor is able to improve the accuracy of the OCR by up to 46% over the baseline by applying pixel-level manipulations to the document image. The implementation of the proposed method and the enhanced public datasets are available for download.
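The abstract does not say how the OCR engine's gradient is approximated, so the following is only a minimal sketch of the general idea, not the paper's method. It swaps in a generic antithetic random-perturbation (evolution-strategies style) estimator. Everything named here is an assumption made for illustration: `ocr_error` is a toy stand-in for a real engine's error score, and the preprocessor is reduced to a two-parameter gain and bias applied per pixel.

```python
import numpy as np


def preprocess(image, theta):
    """Hypothetical pixel-level preprocessor: one gain and one bias per image."""
    gain, bias = theta
    return gain * image + bias


def make_black_box(clean):
    """Build a toy stand-in for an OCR engine (assumption, not a real engine).

    The returned function only yields a scalar error and exposes no
    gradients, which is the property that matters for this sketch.
    """
    def ocr_error(image):
        return float(np.mean((image - clean) ** 2))
    return ocr_error


def estimate_gradient(loss, theta, rng, sigma=0.05, n_samples=32):
    """Estimate d(loss)/d(theta) using only evaluations of the black box."""
    grad = np.zeros_like(theta)
    for _ in range(n_samples):
        eps = rng.standard_normal(theta.shape)
        # Antithetic pair: probe the loss on both sides of theta.
        delta = loss(theta + sigma * eps) - loss(theta - sigma * eps)
        grad += delta / (2.0 * sigma) * eps
    return grad / n_samples


def train_preprocessor(degraded, ocr_error, steps=300, lr=0.5, seed=0):
    """Gradient descent on the preprocessor parameters with estimated gradients."""
    rng = np.random.default_rng(seed)
    theta = np.array([1.0, 0.0])  # start from the identity transform

    def loss(t):
        return ocr_error(preprocess(degraded, t))

    for _ in range(steps):
        theta -= lr * estimate_gradient(loss, theta, rng)
    return theta


# Toy data: a "clean" image and a washed-out, low-contrast copy of it.
data_rng = np.random.default_rng(42)
clean = data_rng.random((8, 8))
degraded = 0.5 * clean + 0.2

ocr_error = make_black_box(clean)
theta = train_preprocessor(degraded, ocr_error)

initial_error = ocr_error(degraded)
final_error = ocr_error(preprocess(degraded, theta))
print(f"error before: {initial_error:.5f}, after: {final_error:.5f}")
```

The point of the design is that the training loop never looks inside `ocr_error`; it only calls it. A real setup would replace the toy error with a score computed from the engine's text output and the two-parameter transform with a learned image-to-image model.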
