Optical character recognition (OCR) is the process of converting analogue documents into digital form using document images. Many commercial and non-commercial OCR systems currently exist for both handwritten and printed text in different languages. Despite this, very few works address the recognition of Bengali words, and most of them focus on OCR of printed Bengali characters. This paper introduces an end-to-end OCR system for the Bengali language. The proposed architecture follows an end-to-end strategy that recognises handwritten Bengali words directly from handwritten word images. We experiment with popular convolutional neural network (CNN) architectures, including DenseNet, Xception, NASNet, and MobileNet, to build the OCR architecture. Further, we experiment with two recurrent neural network (RNN) methods, LSTM and GRU. We evaluate the proposed architecture on the BanglaWriting dataset, a peer-reviewed Bengali handwritten image dataset. The proposed method achieves a character error rate of 0.091 and a word error rate of 0.273 using the DenseNet121 model with a GRU recurrent layer.
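For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how character error rate and word error rate are commonly computed from Levenshtein edit distance. It is not taken from the paper: the function names (`edit_distance`, `error_rate`, `cer`, `wer`) are hypothetical, characters are assumed to be Unicode code points (Bengali conjuncts span several code points, so a grapheme-level count could differ), and errors are assumed to be pooled over the whole test set rather than averaged per sample.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or token lists)."""
    prev = list(range(len(hyp) + 1))  # distance from empty ref to each hyp prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to an empty hyp
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete r from the reference
                curr[j - 1] + 1,     # insert h into the reference
                prev[j - 1] + cost,  # substitute r with h (or match)
            ))
        prev = curr
    return prev[-1]


def error_rate(references, hypotheses, tokenize):
    """Total edits divided by total reference length (assumed pooling scheme)."""
    total_edits = 0
    total_length = 0
    for ref, hyp in zip(references, hypotheses):
        ref_tokens, hyp_tokens = tokenize(ref), tokenize(hyp)
        total_edits += edit_distance(ref_tokens, hyp_tokens)
        total_length += len(ref_tokens)
    if total_length == 0:
        raise ValueError("references contain no tokens")
    return total_edits / total_length


def cer(references, hypotheses):
    """Character error rate: tokens are Unicode code points (an assumption)."""
    return error_rate(references, hypotheses, list)


def wer(references, hypotheses):
    """Word error rate: tokens are whitespace-separated words."""
    return error_rate(references, hypotheses, str.split)
```

Under this definition, a reported character error rate of 0.091 would mean roughly nine character edits per hundred reference characters. When every sample is a single word, as with word images, the word error rate reduces to the fraction of words that are not recognised exactly.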