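Project title: Research on Language Modeling in Personalized Mandarin-Tibetan Bilingual Polyglot Speech Synthesis
Project number: No.61263036
Project type: Regional Science Fund Project
Year of approval: 2013
Discipline: Automation technology, computer technology
Project author: Yang Hongwu (杨鸿武)
Affiliation: Northwest Normal University (西北师范大学)
Funding: 450,000 CNY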
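Abstract (translated from Chinese): Polyglot speech synthesis, which synthesizes speech in different languages in the voice of a single speaker, is an important research topic in multilingual speech processing. Because polyglot speech synthesis research is closely tied to the languages involved, existing work mainly targets languages for which speech synthesis technology is relatively mature, such as Chinese, Japanese, and English, and there is little research on polyglot speech synthesis covering Mandarin, minority languages, and dialects. To address this gap, this project studies polyglot speech synthesis for Mandarin, the Tibetan language mainly used in Tibetan regions, and the Lan-yin Mandarin dialect of Gansu Province. By analyzing the similarities and differences among the languages in polyglot speech synthesis, the project builds language-independent acoustic models and obtains target-language models through language adaptation transformation. By analyzing a speaker's speech features when speaking different languages, it builds an eigenvoice space that characterizes the speaker's individual voice and introduces that space into the speaker adaptation transformation. Using statistical parametric speech synthesis, it then realizes personalized polyglot speech synthesis for Mandarin, Tibetan, and Lan-yin Mandarin. The project can enrich research on speech processing for Tibetan and Lan-yin Mandarin and on polyglot speech synthesis, and can promote language information processing research in Tibetan regions and Gansu Province, giving it both theoretical significance and practical value.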
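Keywords (translated from Chinese): cross-lingual speech synthesis; Mandarin-Tibetan bilingual speech synthesis; Mandarin-Tibetan bilingual emotional speech synthesis; sign language to Mandarin/Tibetan speech conversion; Tibetan visual speech synthesis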
English abstract: Polyglot speech synthesis, which synthesizes speech in different languages with the same speaker's voice, is a distinct field of research in multilingual speech processing. Because polyglot speech synthesis is closely tied to the languages involved, state-of-the-art research focuses on languages such as English, Chinese, and Japanese, for which speech synthesis technology is well developed. Owing to the differences between languages, there is a lack of research on polyglot speech synthesis that mixes Mandarin, Chinese minority languages such as Tibetan, and Chinese dialects such as the Lan-yin Mandarin dialect. To address this deficiency, the project will focus on polyglot speech synthesis of Mandarin; Tibetan, the major minority language in Tibetan districts and Gansu Province; and the Lan-yin Mandarin dialect, the major dialect in Gansu Province. A set of language-independent models will be trained by analyzing the similarities and differences between Mandarin, Tibetan, and the Lan-yin Mandarin dialect. The target-language models will be obtained from the language-independent models through language adaptation transformation. At the same time, an eigenvoice space will be learned by selecting the principal components of the voice charact
English keywords: cross-lingual speech synthesis; Mandarin-Tibetan bilingual speech synthesis; Mandarin-Tibetan emotional speech synthesis; gesture-to-Mandarin/Tibetan speech conversion; Tibetan visual speech synthesis