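Project title: Research on Automatic Proofreading of Modern Tibetan
Project number: No.61202189
Project type: Young Scientists Fund project
Year of approval: 2013
Discipline: Computer Science
Project author: 关白
Affiliation: Tibet University
Funding: 250,000 yuan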
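Abstract (translated from Chinese): Automatic proofreading of modern Tibetan is a promising and highly challenging research topic in Tibetan information processing. As advances in Tibetan information processing have driven the digitization of modern Tibetan publishing, Tibetan web pages, e-books, electronic newspapers, e-mail, and office documents keep emerging, and the volume of electronic text is growing massively. As a result, the proofreading workload involved in using these texts has increased greatly, and manual proofreading can no longer keep up. An automatic proofreading system could check modern Tibetan text at the syllable, word, and syntactic levels quickly, simply, and accurately, replacing primitive, outdated, and laborious manual proofreading. Drawing on existing proofreading techniques for Chinese and English text, this project studies and analyzes in depth the syllables, words, and case particles involved in the automatic proofreading of modern Tibetan text. It makes full use of the theoretical results of traditional Tibetan grammar to study the formation patterns and collocation rules of syllables, words, and case particles in modern Tibetan text, analyzes their error types in light of text proofreading methods and theory, and proposes targeted error detection and error correction methods and algorithms for proofreading modern Tibetan text.
Keywords (translated from Chinese): modern Tibetan; syllables; case particles; error detection; error correction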
English abstract: Automatic proofreading of modern Tibetan is a promising and challenging research topic in Tibetan information processing. Advances in Tibetan information processing have driven the digitization of the modern Tibetan publishing industry: Tibetan websites, e-books, e-newspapers, e-mail, and office documents keep emerging, and the volume of electronic text is growing massively, which greatly increases the proofreading workload. Manual proofreading can no longer keep pace with electronic text. An automatic proofreading system could check the syllables, words, and syntax of modern Tibetan text quickly, easily, and accurately. Building on existing Chinese and English proofreading techniques, this project studies syllables, words, and case particles in the automatic proofreading of Tibetan text. It makes full use of the theoretical findings of traditional Tibetan grammar to study the formation and collocation rules of syllables, words, and case particles in modern Tibetan text. Combining text proofreading methods with an analysis of the error types of syllables, words, and case particles, it proposes error detection and error correction methods and algorithms for proofreading modern Tibetan text.
English keywords: modern Tibetan; syllables; case particles; error detection; error correction