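Project title: Research on Recognition and Retrieval Technology for Woodblock-Printed Ancient Mongolian Books
Project number: No.60865003
Project type: Regional Science Fund Project
Year of approval: 2009
Discipline: Weapons industry
Principal investigator: Gao Guanglai (高光来)
Affiliation: Inner Mongolia University (内蒙古大学)
Funding: 250,000 yuan (RMB)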
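Chinese abstract: Woodblock-printed ancient Mongolian books and documents cover religion, history, literature, astronomy, medicine, and many other fields, and are a precious part of humanity's cultural heritage. The most representative of them is the Mongolian Kanjur (《甘珠尔经》), woodblock-printed in Beijing in 1720, during the Kangxi reign of the Qing dynasty; only eight sets survive. Taking the Mongolian Kanjur as its object, this project approaches the material from the perspectives of character recognition and information retrieval. It systematically studied and solved the key problems involved in recognizing and retrieving woodblock-printed ancient Mongolian texts: glyph segmentation, determination of the glyph set, glyph feature analysis and selection, classifier design, recognition post-processing, error correction, and index-term selection. On this basis, a preliminarily usable recognition and retrieval system for woodblock-printed ancient Mongolian books was developed. This work is significant for mining and using ancient Mongolian documents, for passing on and developing ethnic minority culture, and for promoting social development and scientific and technological progress in ethnic minority regions.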
Chinese keywords: Woodblock printing; Mongolian Kanjur; glyph segmentation; multiple-classifier fusion; document image retrieval
English abstract: Woodblock-printed ancient Mongolian books and documents are a precious part of humanity's cultural heritage. Their content covers religion, history, literature, astronomy, medicine, and other fields. The Mongolian Kanjur, woodblock-printed in 1720 during the Kangxi period of the Qing Dynasty, is the most famous and representative of them; only eight sets of this edition survive worldwide. This project takes the Mongolian Kanjur as its research object. From the perspectives of optical character recognition and information retrieval, the project studied and solved each of the key issues involved: glyph segmentation, determination of the glyph set, feature analysis, feature selection, classifier design, post-processing, error correction, and selection of index units. Building on this fundamental work, a recognition and retrieval system for the Mongolian Kanjur has been developed. This work is significant for mining and using ancient Mongolian literature. It also helps to pass on and develop ethnic minority culture in the Inner Mongolia Autonomous Region, and to promote social development and scientific and technological progress in ethnic minority regions.
English keywords: Woodblock printing; Mongolian Kanjur; Glyph segmentation; Multiple classifier combination; Document image retrieval