Document image enhancement and binarization methods are often used to improve the accuracy and efficiency of document image analysis tasks such as text recognition. Traditional non-machine-learning methods are constructed on low-level features in an unsupervised manner but have difficulty with binarization on documents with severely degraded backgrounds. Convolutional neural network-based methods focus only on grayscale images and on local textual features. In this paper, we propose a two-stage color document image enhancement and binarization method using generative adversarial neural networks. In the first stage, four color-independent adversarial networks are trained to extract color foreground information from an input image for document image enhancement. In the second stage, two independent adversarial networks with global and local features are trained for image binarization of documents of variable size. For the adversarial neural networks, we formulate loss functions between a discriminator and generators having an encoder-decoder structure. Experimental results show that the proposed method achieves better performance than many classical and state-of-the-art algorithms over the Document Image Binarization Contest (DIBCO) datasets, the LRDE Document Binarization Dataset (LRDE DBD), and our shipping label image dataset.
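The two-stage data flow described above can be illustrated with a minimal sketch. It is not the proposed implementation. The trained adversarial generators are replaced by simple placeholder functions. The choice of red, green, blue, and gray as the four channels is an assumption, as is the pixel-wise AND used to merge the local and global results. All function names and the `patch` parameter are hypothetical.

```python
import numpy as np

def split_channels(rgb):
    """Split an H x W x 3 image into four single-channel planes.
    Assumption: the four planes are R, G, B, and their gray average."""
    rgb = rgb.astype(np.float64)
    gray = rgb.mean(axis=2)
    return [rgb[..., 0], rgb[..., 1], rgb[..., 2], gray]

def placeholder_generator(channel):
    """Stand-in for one trained encoder-decoder generator.
    Here it only min-max normalizes the plane to [0, 1]."""
    lo, hi = channel.min(), channel.max()
    if hi == lo:
        return np.zeros_like(channel)
    return (channel - lo) / (hi - lo)

def enhance(rgb):
    """Stage 1: process each plane independently, then stack the results."""
    planes = [placeholder_generator(c) for c in split_channels(rgb)]
    return np.stack(planes, axis=2)

def binarize_global(enhanced):
    """Stage 2, global branch: one threshold for the whole image."""
    intensity = enhanced.mean(axis=2)
    return intensity < intensity.mean()

def binarize_local(enhanced, patch=4):
    """Stage 2, local branch: one threshold per patch.
    Slicing handles images whose size is not a multiple of `patch`."""
    intensity = enhanced.mean(axis=2)
    out = np.zeros(intensity.shape, dtype=bool)
    h, w = intensity.shape
    for y in range(0, h, patch):
        for x in range(0, w, patch):
            tile = intensity[y:y + patch, x:x + patch]
            out[y:y + patch, x:x + patch] = tile < tile.mean()
    return out

def binarize(rgb, patch=4):
    """Run both stages; True marks foreground (dark) pixels.
    Assumption: local and global masks are merged with a pixel-wise AND."""
    enhanced = enhance(rgb)
    return binarize_local(enhanced, patch) & binarize_global(enhanced)
```

Calling `binarize` on an `H x W x 3` array of any size returns a boolean mask of shape `H x W`. In a real system each placeholder would be replaced by a generator trained against a discriminator.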