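Project title: Research on Mongolian Keyword Spotting Technology for Telephone Speech
Project number: No.61263037
Project type: Regional Science Fund project
Year of approval: 2013
Discipline: Automation technology, computer technology
Project author: Gao Guanglai (高光来)
Affiliation: Inner Mongolia University (内蒙古大学)
Funding: 430,000 yuan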
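Abstract (translated from Chinese): Mongolian is spoken across several countries and regions and has broad international influence; its speakers live in China, Mongolia, Russia, and other countries. The Mongolian used in China and in Mongolia shares one spoken language but is written in different scripts, which gives it a prominent position in national security strategy. In addition, Mongolian speech resources are increasingly widely used and are growing rapidly in volume; they have become a valuable ethnic cultural resource that awaits further development and use. This project takes Mongolian telephone speech as its object and studies a series of key problems in spoken keyword spotting: decoding in the Mongolian speech recognition system, lattice optimization and index construction, keyword detection models and confidence measure computation, out-of-vocabulary word handling, keyword query expansion, and automatic conversion of Mongolian letters to phonemes. It will also build a Mongolian keyword spotting system that largely meets application requirements. We will draw on advanced experience from other languages and, taking the characteristics of Mongolian into account, overcome a series of difficulties to improve detection accuracy. The Mongolian spoken keyword spotting technology studied in this project has significant academic value, and it is also important for safeguarding national security, maintaining stability in border ethnic minority regions, and promoting the flourishing and development of ethnic minority cultures.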
Keywords (translated from Chinese): Mongolian; keyword spotting; speech recognition; confidence measure; out-of-vocabulary words
English abstract: Mongolian is an influential language used in several countries, including China, Mongolia, and Russia. The Mongolian used in China is called "traditional Mongolian", which differs from the "Cyrillic Mongolian" used in Mongolia. The two varieties are spoken alike but written differently. As a result, Mongolian holds a prominent position in our country's security strategy. In addition, Mongolian speech resources are growing rapidly in education, culture, film, television, and other fields. They are a precious cultural resource for the Mongolian people and need to be further developed and used. This project studies the key issues in Mongolian spoken keyword spotting, taking Mongolian telephone speech as its object of study. The research covers decoding in the Mongolian speech recognition system, lattice optimization, indexing, the keyword spotting model, the confidence measure calculation approach, out-of-vocabulary word processing, query expansion, and grapheme-to-phoneme conversion. Finally, a Mongolian keyword spotting sys
English keywords: Mongolian;Keyword spotting;Speech recognition;Confidence measure;Out of vocabulary