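Project title: Research on Dai Language Text Analysis and Speech Synthesis
Project number: No.61262068
Project type: Regional Science Fund Project
Year approved: 2013
Discipline: Automation technology, computer technology
Project author: Yang Jian (杨鉴)
Affiliation: Yunnan University (云南大学)
Funding: 460,000 yuan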
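Chinese abstract: As the naturalness of synthesized speech has improved, products based on speech synthesis technology have come into wide use. In China, Mandarin speech synthesis has already been commercialized, and speech synthesis for minority languages such as Tibetan and Uyghur has reached the productization stage. However, speech synthesis for the minority languages of Yunnan has not received the attention it deserves, and speech synthesis for the Dai language has so far gone unstudied. This project aims to develop a Dai text-to-speech application system. It will design and build a Dai speech synthesis corpus; study, in light of the characteristics of Dai, how to select synthesis units; design the context attributes and question set used for decision-tree clustering of the HMM acoustic models; and optimize the training pipeline of the speech synthesizer. It will study prosodic annotation rules and part-of-speech tagging methods for Dai; build a Dai dictionary; study sentence preprocessing, word segmentation, and prosodic phrase prediction for Dai; and build a text analysis system. Because modern Dai makes widespread use of Chinese loanwords and English vocabulary, the project will also study the text-to-speech conversion of foreign words in Dai, and will develop a real-time Dai text-to-speech demonstration system. The project will strongly promote speech synthesis research on China's minority languages and advance the wide application of speech technology in frontier ethnic-minority regions.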
Chinese keywords: Dai language; speech synthesis; text-to-speech; speech database; Chinese loanwords
English abstract: With the improvement in the naturalness of synthesized speech generated by text-to-speech systems, products using speech synthesis technology have been widely deployed. Mandarin speech synthesis products have been popularized in China, and speech synthesis technology for minority languages such as Tibetan and Uyghur has reached the product development stage. However, research on speech synthesis for the minority languages of Yunnan has not received due attention, and there is little research on speech synthesis for the Dai language at present. In this project, a trainable text-to-speech system for Dai will be developed, and a Dai corpus for speech synthesis will be designed and constructed. According to the characteristics of the Dai language, the following work will be conducted: (1) Selection of the units for speech synthesis, design of the context attributes and question set used for decision-tree clustering of HMMs, and optimization of the training process of the speech synthesizer. (2) Research on prosodic labeling rules and part-of-speech tagging methods for the Dai language. (3) Dictionary construction, research on methods for sentence pre-processing, word segmentation, and prediction of prosodic phrase breaks, and then the development of
English keywords: Dai Language; Speech Synthesis; Text to Speech; Speech Corpus; Chinese loanwords