Open Set Chinese Character Recognition using Multi-typed Attributes

Recognition of off-line Chinese characters remains a challenging problem, especially in historical documents: not only is the number of classes extremely large compared with contemporary image retrieval methods, but new, unseen classes can also be expected under open learning conditions (even for a CNN). Chinese character recognition with zero or only a few training samples is a difficult problem and has not yet been studied. In this paper, we propose a new Chinese character recognition method based on multi-typed attributes, which are derived from the pronunciation, structure and radicals of Chinese characters, and apply it to character recognition in historical books. This intermediate attribute code has a strong advantage over the common 'one-hot' class representation because it allows complex and unseen patterns to be understood symbolically through attributes. First, each character is represented by four groups of attribute types that cover a wide range of character possibilities: the Pinyin label, the layout structure, the number of strokes, three different input methods (Cangjie, Zhengma and Wubi), and a four-corner encoding method. A convolutional neural network (CNN) is trained to learn these attributes. Subsequently, characters can be recognized from these attributes using a distance metric and a complete lexicon encoded in attribute space. We evaluate the proposed method on two open data sets, printed Chinese characters for zero-shot learning and historical characters for few-shot learning, and on one closed set of handwritten Chinese characters. Experimental results show good general classification of seen classes and a very promising ability to generalize to unseen characters.
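
The recognition step described in the abstract (predict attributes, then look up the nearest lexicon entry in attribute space) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the attribute groups in `GROUPS`, the three-character toy lexicon, the helper names `encode`, `euclidean` and `recognize`, and the choice of Euclidean distance. The CNN is not modelled; its output is stood in for by a hand-written noisy vector.

```python
from math import sqrt

# Hypothetical attribute vocabulary: one group per attribute type.
# The paper's real groups (Pinyin, layout, strokes, input-method codes,
# four-corner code) are far larger; these are toy stand-ins.
GROUPS = {
    "layout": ["single", "left-right", "top-bottom"],
    "strokes": ["1-5", "6-10", "11+"],
    "tone": ["1", "2", "3", "4"],
}


def encode(attrs):
    """Concatenate one one-hot block per attribute group."""
    vec = []
    for name, values in GROUPS.items():
        vec.extend(1.0 if attrs[name] == v else 0.0 for v in values)
    return vec


# Toy lexicon encoded in attribute space (illustrative values only).
# A class never seen in training can still be listed here, which is
# what makes attribute-based zero-shot lookup possible.
LEXICON = {
    "木": encode({"layout": "single", "strokes": "1-5", "tone": "4"}),
    "林": encode({"layout": "left-right", "strokes": "6-10", "tone": "2"}),
    "森": encode({"layout": "top-bottom", "strokes": "11+", "tone": "1"}),
}


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def recognize(predicted, lexicon=LEXICON):
    """Return the lexicon character whose attribute code is nearest."""
    return min(lexicon, key=lambda ch: euclidean(predicted, lexicon[ch]))


# Stand-in for a CNN's soft attribute prediction (made-up numbers).
noisy_prediction = [0.1, 0.8, 0.2, 0.1, 0.7, 0.1, 0.0, 0.9, 0.1, 0.0]
result = recognize(noisy_prediction)
```

Because the lookup depends only on the attribute codes, extending the lexicon with a new character requires no retraining of the attribute predictor in this sketch; whether the predicted attributes are accurate enough for unseen characters is the empirical question the paper evaluates.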
