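Project title: Research on Noise-Robust Speech Recognition Methods Based on Deep Neural Networks
Project number: No.61305002
Project type: Young Scientists Fund
Year of approval: 2014
Discipline: Automation technology, computer technology
Principal investigator: Du Jun (杜俊)
Affiliation: University of Science and Technology of China
Funding amount: 250,000 yuan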
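Chinese abstract (in translation): Improving the noise robustness of speech recognition systems in real-world environments is one of the key difficulties and active research topics in making speech recognition practical. However, the statistical properties of both speech and noise signals are extremely complex, and traditional noise-robustness methods make many assumptions to simplify theoretical derivation. This greatly limits improvements in recognition performance and prevents the complementary strengths of different methods from being combined effectively. With the successful application of deep neural networks (DNNs) to acoustic modeling for large-vocabulary continuous speech recognition, research that applies DNNs to the noise-robustness problem is expected to remedy the shortcomings of traditional methods and bring breakthrough progress. This project aims to fully exploit the strong nonlinear modeling capability of DNNs. On the one hand, a DNN is used for front-end feature extraction, for example to learn the mapping between noisy speech and "clean speech". On the other hand, a DNN is used for back-end acoustic modeling, for example using a Hierarchical DNN to fuse different front-end algorithms. In addition, the front-end and back-end DNNs can be jointly optimized, with the goal of maximizing speech recognition performance in noisy environments. Some of the project's results are also significant for fundamental problems in signal processing fields such as speech enhancement.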
Chinese keywords (in translation): speech recognition; deep neural network; noise robustness; pre-processing; post-processing
English abstract: Improving the noise robustness of automatic speech recognition (ASR) systems in real scenarios is one of the key challenges and active research topics in the practical application of speech recognition. However, the statistical properties of both speech and noise signals are extremely complicated, and traditional noise-robust methods make many assumptions to simplify theoretical derivation. As a result, the improvement in recognition performance is limited, and the complementary advantages of different methods cannot be combined properly. With the successful application of deep neural networks (DNNs) to the acoustic modeling of large vocabulary continuous speech recognition (LVCSR), research on DNNs for noise robustness is expected to remedy the shortcomings of traditional noise-robust methods and bring a breakthrough. This project aims to fully exploit the powerful nonlinear modeling capability of DNNs. On the one hand, a DNN is used in the front-end for feature extraction, e.g., to learn the mapping function between noisy speech and "clean speech". On the other hand, a DNN is used for acoustic modeling in the back-end, e.g., to combine different front-end algorithms by using a Hierarchical DNN. In addition, the front-end and back-end DNNs can be concatenated for joint optimization. Hopefully it can further improve the recognition per
English keywords: speech recognition; deep neural network; noise robustness; pre-processing; post-processing