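Project title: Structured Deep Learning for Speech Representation and Separation (面向语音表示及分离的结构化深度学习研究)
Project number: No.61471394
Project type: General Program (面上项目)
Year of approval: 2015
Discipline: Radio electronics and telecommunications technology
Principal investigator: Zhang Xiongwei (张雄伟)
Affiliation: Army Engineering University of PLA (中国人民解放军陆军工程大学)
Funding: CNY 800,000 (80万元)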
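Abstract (translated from Chinese): Speech signals contain many sources of variability, such as different speakers, speaking tone, background noise, other speakers' voices, and echo. The human auditory perception system can easily filter out interfering information and extract useful information, and it adapts well to changes in the form of speech and in the environment. Deep learning simulates how the human brain processes perceptual information, and it offers a new approach to speech representation and separation. Building on the theory and algorithms of deep learning, this project addresses the representation and separation of speech signals. By studying and improving structured deep belief network models, it aims to overcome key difficulties in training, namely uncertain model topology, high computational complexity, and non-convex optimization, in order to obtain better hierarchical representations of speech signals, separate different sources and noise, and provide better front-end models for subsequent speech processing tasks.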
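Keywords (translated from Chinese): deep learning;Markov chain Monte Carlo sampling;speech representation;structured learning;speech separation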
English abstract: Speech signals vary in many ways, for example across speakers, emotions, background noise, and reverberation. The human auditory system nevertheless adapts to these variations by filtering out irrelevant noise and attending to useful target information. Deep learning simulates information processing in the human brain, which offers a novel approach to speech representation and separation. In this project, deep learning is applied and improved to obtain better solutions for speech representation and separation. The key steps are to investigate structured deep belief networks, to determine the network topology adaptively, to reduce the high computational complexity, and to alleviate the non-convexity of the optimization. By separating different speech sources and noise, better acoustic models can be obtained for subsequent speech processing tasks.
English keywords: deep learning;MCMC sampling;speech representation;structured learning;speech separation