Detection and recognition of text in scans and other images, commonly referred to as Optical Character Recognition (OCR), is a widely used form of automated document processing, and many methods are available. Yet OCR systems still do not achieve 100% accuracy, so human correction is required in applications where a correct readout is essential. Advances in machine learning have enabled even more challenging scenarios of text detection and recognition "in the wild", such as detecting text on objects in photographs of complex scenes. While state-of-the-art methods for in-the-wild text recognition are typically evaluated on complex scenes, their performance in the document domain is typically not published, and a comprehensive comparison with methods for document OCR is missing. This paper compares several methods designed for in-the-wild text recognition and for document text recognition, and evaluates them in the domain of structured documents. The results suggest that state-of-the-art methods originally proposed for in-the-wild text detection also achieve competitive results on document text detection, outperforming available OCR methods. We argue that the application of document OCR should not be omitted from the evaluation of text detection and recognition methods.