Detection and recognition of text in scans and other images, commonly referred to as Optical Character Recognition (OCR), is a widely used form of automated document processing, with a number of methods available. Advances in machine learning have enabled even more challenging scenarios of text detection and recognition "in-the-wild", such as detecting text on objects in photographs of complex scenes. While state-of-the-art methods for in-the-wild text recognition are typically evaluated on complex scenes, their performance in the document domain has not been published. This paper compares several methods designed for in-the-wild text recognition and for document text recognition, and evaluates them on the domain of structured documents. The results suggest that state-of-the-art methods originally proposed for in-the-wild text detection also achieve excellent results on document text detection, outperforming available OCR methods. We argue that the document OCR application should not be omitted when evaluating text detection and recognition methods.