Geometric rectification of images of distorted documents is widely used in document digitization and Optical Character Recognition (OCR). Although smoothly curved deformations have been investigated extensively, the most challenging distortions, e.g. complex creases and large folds, have received little dedicated study. Existing approaches perform poorly on heavily creased or folded documents, leaving substantial room for improvement. To tackle this task, prior knowledge about document rectification should be incorporated into the computation; the most essential pieces are the developability of the 3D document model and particular textural features in the images, such as straight lines. To this end, we propose a general framework for document image rectification in which a computational isometric mapping model represents a 3D document model and its flattening in the plane. Within this framework, both model developability and textural features are taken into account in the computation. Experiments and comparisons with state-of-the-art approaches demonstrate the effectiveness and outstanding performance of the proposed method. The method is also flexible: its rectification results can be further improved by any other method that extracts high-quality feature lines from the images.
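To make the notion of an isometric flattening concrete, the following minimal sketch measures how well a 2D flattening of a triangulated 3D sheet preserves edge lengths. It is an illustration only, not the mapping model proposed here: the function name `edge_length_distortion`, the triangle-mesh representation, and the toy folded sheet are all assumptions, and edge-length preservation is used as a simple discrete proxy for isometry.

```python
import numpy as np

def edge_length_distortion(verts3d, verts2d, triangles):
    """Relative change in length of every mesh edge under the 3D -> 2D map.

    Hypothetical helper: a flattening that is isometric on the mesh
    returns values close to zero for all edges.
    """
    verts3d = np.asarray(verts3d, dtype=float)
    verts2d = np.asarray(verts2d, dtype=float)
    tris = np.asarray(triangles, dtype=int)
    # Collect the three edges of each triangle, then drop duplicates.
    edges = np.concatenate([tris[:, [0, 1]], tris[:, [1, 2]], tris[:, [2, 0]]])
    edges = np.unique(np.sort(edges, axis=1), axis=0)
    len3d = np.linalg.norm(verts3d[edges[:, 0]] - verts3d[edges[:, 1]], axis=1)
    len2d = np.linalg.norm(verts2d[edges[:, 0]] - verts2d[edges[:, 1]], axis=1)
    return np.abs(len2d - len3d) / len3d

# Toy sheet: a 2 x 1 strip folded by 90 degrees along the line x = 1.
flat = np.array([[0, 0], [1, 0], [2, 0], [0, 1], [1, 1], [2, 1]], dtype=float)
folded = np.array([[0, 0, 0], [1, 0, 0], [1, 0, 1],
                   [0, 1, 0], [1, 1, 0], [1, 1, 1]], dtype=float)
# The crease runs along mesh edge (1, 4), so no triangle straddles it.
tris = np.array([[0, 1, 4], [0, 4, 3], [1, 2, 5], [1, 5, 4]])

distortion = edge_length_distortion(folded, flat, tris)
print("max relative edge distortion:", distortion.max())
```

A sharp fold along a mesh edge keeps every edge length unchanged, which is why a creased sheet is still developable; a flattening that stretches the sheet would show up as nonzero distortion.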