Handwritten Text Recognition (HTR) is more interesting and challenging than printed text recognition because handwriting varies unevenly across writers, content, and time. HTR is even more challenging for Indic languages because (i) multiple characters combine to form conjuncts, which increases the number of distinct characters in each language, and (ii) each Indic script has close to 100 unique basic Unicode characters. Recently, many recognition methods based on the encoder-decoder framework have been proposed to handle such problems. They still face many challenges, such as image blur and incomplete characters caused by varying writing styles and ink density. We argue that most encoder-decoder methods rely on local visual features without explicit global semantic information. In this work, we enhance the performance of Indic handwritten text recognizers using global semantic information. We use a semantic module in an encoder-decoder framework to extract global semantic information for recognizing Indic handwritten texts. The semantic information is used both in the encoder, for supervision, and in the decoder, for initialization. The semantic information is predicted from the word embedding of a pre-trained language model. Extensive experiments demonstrate that the proposed framework achieves state-of-the-art results on handwritten texts of ten Indic languages.
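The two roles of the semantic information described above (supervision on the encoder side, initialization on the decoder side) can be illustrated with a small sketch. This is not the paper's implementation. The mean pooling, the cosine loss, the tanh projection, all dimensions, and every name (`encoder_features`, `W_sem`, `word_embedding`, `decoder_h0`) are assumptions chosen only for illustration, and the "pre-trained" embedding is a hard-coded stand-in.

```python
import math
import random

def linear(x, weights, bias):
    """Apply y = W x + b to a single vector x (plain lists of floats)."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def mean_pool(features):
    """Average per-timestep encoder features into one global vector."""
    n = len(features)
    return [sum(col) / n for col in zip(*features)]

def cosine_loss(pred, target):
    """1 - cosine similarity; 0 when the two vectors point the same way."""
    dot = sum(p * t for p, t in zip(pred, target))
    norm = (math.sqrt(sum(p * p for p in pred))
            * math.sqrt(sum(t * t for t in target)))
    return 1.0 - dot / norm

def init_matrix(rows, cols, rng):
    """Small random weight matrix (untrained, for illustration only)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
FEAT_DIM, SEM_DIM, HID_DIM = 8, 4, 6  # hypothetical sizes

# Hypothetical encoder output: 5 timesteps of local visual features.
encoder_features = [[rng.uniform(-1, 1) for _ in range(FEAT_DIM)]
                    for _ in range(5)]

# Semantic module (assumed form): pool local features, then project
# them into the word-embedding space.
W_sem, b_sem = init_matrix(SEM_DIM, FEAT_DIM, rng), [0.0] * SEM_DIM
semantic = linear(mean_pool(encoder_features), W_sem, b_sem)

# Encoder-side supervision: compare the predicted semantic vector with
# the word embedding of the ground-truth word. The vector below is a
# stand-in for an embedding from a pre-trained language model.
word_embedding = [0.5, -0.2, 0.1, 0.9]
semantic_loss = cosine_loss(semantic, word_embedding)

# Decoder-side initialization: derive the initial hidden state from the
# semantic vector instead of starting from zeros.
W_init, b_init = init_matrix(HID_DIM, SEM_DIM, rng), [0.0] * HID_DIM
decoder_h0 = [math.tanh(v) for v in linear(semantic, W_init, b_init)]
```

In a trainable model, `semantic_loss` would be added to the recognition loss so that the encoder is pushed toward features that carry word-level meaning, while `decoder_h0` gives the decoder that global context before it emits its first character.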