Handwritten text recognition, a sub-domain of pattern recognition, is widely studied in the computer vision community because of its room for improvement and its applicability to daily life. With the growth in computational power over the last few decades, neural-network-based systems have contributed heavily to state-of-the-art handwritten text recognizers. In the same direction, we take two state-of-the-art neural network systems and merge the attention mechanism into them. Attention has been widely used in neural machine translation and automatic speech recognition and is now being applied to text recognition. In this study, we achieve a 4.15% character error rate and a 9.72% word error rate on the IAM dataset, and a 7.07% character error rate and a 16.14% word error rate on the GW dataset, after merging attention and a word beam search decoder with the existing Flor et al. architecture. For further analysis, we also use a system similar to the Shi et al. neural network with a greedy decoder and observe a 23.27% improvement in character error rate over the base model.
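The character error rate (CER) and word error rate (WER) reported above are conventionally computed as the edit distance between the recognized text and the ground truth, divided by the ground-truth length. The following is a minimal sketch under that assumption; it is not the authors' evaluation code, and the function names `edit_distance`, `error_rate`, and `relative_improvement` are hypothetical. The abstract does not say whether the 23.27% improvement is relative or absolute, so the last helper only shows the relative reading.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists of tokens)."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(references, hypotheses, unit="char"):
    """Corpus-level error rate: total edits divided by total reference length.

    unit="char" gives CER (characters, spaces included);
    unit="word" gives WER (whitespace-separated tokens).
    """
    total_edits = 0
    total_length = 0
    for ref, hyp in zip(references, hypotheses):
        ref_units = list(ref) if unit == "char" else ref.split()
        hyp_units = list(hyp) if unit == "char" else hyp.split()
        total_edits += edit_distance(ref_units, hyp_units)
        total_length += len(ref_units)
    return total_edits / total_length


def relative_improvement(base_rate, new_rate):
    """Relative reduction in error rate (assumed interpretation, see lead-in)."""
    return (base_rate - new_rate) / base_rate


if __name__ == "__main__":
    refs = ["the cat sat"]
    hyps = ["the cat sit"]
    print("CER:", error_rate(refs, hyps, unit="char"))  # 1 edit over 11 characters
    print("WER:", error_rate(refs, hyps, unit="word"))  # 1 edit over 3 words
```

Summing edits and lengths over the whole corpus, rather than averaging per-line rates, keeps short lines from dominating the score; published evaluations differ on this choice, so it should be checked against the protocol being compared to.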