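Project title: Research on a Basic Database and Recognition Methods for Online Handwritten Uyghur (联机手写维吾尔文基础数据库及识别方法研究)
Project number: No. 61462088
Project type: Regional Science Fund Project
Year of approval: 2015
Project discipline: Other
Project author: Qi Xiangwei (齐向伟)
Author affiliation: Xinjiang Normal University (新疆师范大学)
Project funding: CNY 470,000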
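Chinese abstract (translated): With advances in computer recognition technology and mobile applications, handwriting input software has rapidly captured the market in China and abroad. However, although Uyghur is one of the official languages of the Xinjiang Uygur Autonomous Region, its handwriting technology still lags behind, and text input remains at the stage of keyboard encoding. Building on handwriting recognition techniques for Chinese, English, and Arabic and on the research group's earlier work, this project starts from the structure, word-formation characteristics, and input habits of the Uyghur language and script to study how an agglutinative language affects handwriting input recognition. It will build a standard, unconstrained, large-scale base database of online handwritten Uyghur samples. On that basis, it will study the segmentation of connected Uyghur characters using contour-feature rules, together with a multi-queue primitive-merging model. In light of the characteristics of Uyghur, it will improve the existing LVQ neural network recognizer, study a primitive recognition algorithm based on character-primitive decomposition and adaptive fusion, and achieve efficient recognition of Uyghur characters by combining classifiers in parallel. It will also make a preliminary study of recognition post-processing that combines statistics with rules, and of techniques for building language models. We hope this research will contribute to the development of multilingual information technology in Xinjiang and to the autonomous region's strategic plan of opening up westward.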
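Chinese keywords (translated): Uyghur script; online handwriting recognition; handwriting database; segmentation and recognition; recognition post-processing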
English abstract: With the development of computer recognition technology and mobile applications, handwriting input software has rapidly come to dominate the market both at home and abroad. Although Uyghur is one of the official languages of Xinjiang, its handwriting recognition technology still lags behind, and text input remains at the stage of keyboard encoding. Building on handwriting recognition techniques for Chinese, English, Arabic, and other scripts, and on the group's previous studies, this project starts from the structure of the Uyghur language and script, its word-formation characteristics, and its input habits to study the influence of an agglutinative language on handwriting input recognition, and to establish a standard, unconstrained, large-scale base database of online handwritten Uyghur samples. On this basis, it studies the segmentation of connected Uyghur characters based on contour-feature rules and a multi-queue primitive-merging model. In light of the characteristics of Uyghur, it improves the existing LVQ neural network recognizer, studies a primitive recognition algorithm based on character-primitive decomposition and adaptive fusion, and achieves efficient recognition of Uyghur characters by combining classifiers in parallel. It also makes a preliminary study of recognition post-processing that combines statistics with rules, and of techniques for building language models. Through this research, we hope to contribute to the development of multilingual information technology in Xinjiang and to the autonomous region's strategy of opening up to the west.
English keywords: Uyghur; online handwriting recognition; handwriting database; segmentation and recognition; recognition post-processing