Handwriting recognition is crucial to both Human-Computer Interaction (HCI) and paperwork digitization. Within the general field of Optical Character Recognition (OCR), handwritten Chinese character recognition is especially challenging because of its very large character set and the wide diversity of writing styles. Learning an appropriate distance metric to measure the difference between data inputs is the foundation of accurate handwritten character recognition. Existing distance metric learning approaches either produce unacceptable error rates or provide little interpretability in their results. In this paper, we propose an interpretable distance metric learning approach for handwritten Chinese character recognition. The learned metric is a linear combination of intelligible base metrics and thus provides meaningful insights to ordinary users. Our experimental results on a benchmark dataset demonstrate the superior efficiency, accuracy, and interpretability of the proposed approach.
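As a rough illustration of what "a linear combination of intelligible base metrics" can look like, the sketch below combines three familiar distances with non-negative weights and uses the result for nearest-prototype classification. Everything in it is an assumption made for illustration: the abstract names neither the base metrics nor the learning procedure, so the choice of Euclidean, Manhattan, and Chebyshev distances, the function names `combined_distance` and `nearest_label`, and the hand-picked weights are all hypothetical. In the proposed approach the weights would be learned from data rather than fixed by hand.

```python
from typing import Callable, Dict, Sequence

Vector = Sequence[float]
Metric = Callable[[Vector, Vector], float]


# Three simple, human-readable base metrics (assumed for illustration;
# the abstract does not say which base metrics the paper uses).
def euclidean(a: Vector, b: Vector) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def manhattan(a: Vector, b: Vector) -> float:
    return sum(abs(x - y) for x, y in zip(a, b))


def chebyshev(a: Vector, b: Vector) -> float:
    return max(abs(x - y) for x, y in zip(a, b))


def combined_distance(
    a: Vector,
    b: Vector,
    metrics: Sequence[Metric],
    weights: Sequence[float],
) -> float:
    """Weighted sum of base metrics: d(a, b) = sum_k w_k * d_k(a, b)."""
    if len(metrics) != len(weights):
        raise ValueError("need exactly one weight per base metric")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    return sum(w * m(a, b) for m, w in zip(metrics, weights))


def nearest_label(
    query: Vector,
    prototypes: Dict[str, Vector],
    metrics: Sequence[Metric],
    weights: Sequence[float],
) -> str:
    """Return the label of the prototype closest to `query`."""
    return min(
        prototypes,
        key=lambda label: combined_distance(
            query, prototypes[label], metrics, weights
        ),
    )


# Hypothetical weights; a learned metric would fit these to training data.
BASE_METRICS = [euclidean, manhattan, chebyshev]
WEIGHTS = [0.5, 0.3, 0.2]

if __name__ == "__main__":
    d = combined_distance((0.0, 0.0), (3.0, 4.0), BASE_METRICS, WEIGHTS)
    print(f"combined distance: {d:.2f}")
```

Because each weight attaches to a named base metric, the weights themselves can be read as a statement of how much each notion of difference contributes to the final distance, which is the kind of interpretability the abstract refers to.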