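Research on Speech Feature Enhancement Methods Based on Sparse Coding

Project title: Research on Speech Feature Enhancement Methods Based on Sparse Coding

Project number: No.61305001

Project type: Young Scientists Fund project

Year of approval: 2014

Discipline: Automation technology, computer technology

Project author: He Yongjun (何勇军)

Author affiliation: Harbin University of Science and Technology (哈尔滨理工大学)

Funding: 250,000 yuan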

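Chinese abstract (translated): Current speech recognition systems achieve high recognition rates under ideal conditions, but their performance degrades sharply in the presence of environmental noise, which severely limits the wide application of speech recognition technology. To address this problem, this project draws on the basic theory and methods of sparse coding to study effective methods for speech feature enhancement, with the aim of improving the noise robustness of speech recognition systems. Sparse coding represents signals under a sparsity criterion, makes no stationarity assumption about the noise, and is consistent with the way the human auditory system processes information, so it offers a new approach to speech feature enhancement. The project is organized around three basic problems in sparse coding: dictionary construction, sparse decomposition, and signal reconstruction. For dictionary construction, it studies sound strategies for evaluating, optimizing, and updating dictionaries. For sparse decomposition, it studies decomposition algorithms that account for temporal correlation, together with parameter-setting methods that adapt to time-varying noise. For reconstruction, it studies dynamic reconstruction algorithms that exploit prior knowledge and strategies for dynamically masking erroneous atoms. Finally, it studies speech feature extraction based on the enhanced spectrum. The research has theoretical significance and practical value for improving the noise robustness of speech recognition systems and thereby moving them toward real-world use.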
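The three stages named in the abstract (dictionary construction, sparse decomposition, reconstruction) can be illustrated with a minimal sketch. It is not the project's algorithm: it assumes that a speech dictionary and a noise dictionary with unit-norm columns are already available, uses plain orthogonal matching pursuit (OMP) as a stand-in decomposition method, and ignores temporal correlation and nonnegativity of real spectra. The names `omp`, `enhance_spectrum`, `D_speech`, and `D_noise` are hypothetical and chosen only for this example.

```python
import numpy as np


def omp(D, y, n_nonzero, tol=1e-10):
    """Greedy sparse decomposition of y over the columns (atoms) of D.

    Generic orthogonal matching pursuit, used here only as an illustration.
    Assumes the columns of D have unit norm.
    """
    residual = y.astype(float).copy()
    support = []
    sol = np.zeros(0)
    for _ in range(n_nonzero):
        # Pick the atom most correlated with the current residual.
        corr = np.abs(D.T @ residual)
        if support:
            corr[support] = 0.0  # never select the same atom twice
        k = int(np.argmax(corr))
        if corr[k] < tol:
            break  # residual is already explained
        support.append(k)
        # Refit all selected atoms jointly by least squares.
        sol, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ sol
    coef = np.zeros(D.shape[1])
    if support:
        coef[support] = sol
    return coef


def enhance_spectrum(y, D_speech, D_noise, n_nonzero=4):
    """Decompose y over [speech | noise] atoms, rebuild from speech atoms only."""
    D = np.hstack([D_speech, D_noise])  # composite dictionary
    coef = omp(D, y, n_nonzero)         # sparse decomposition
    n_speech = D_speech.shape[1]
    # Reconstruction: discard the coefficients assigned to noise atoms.
    return D_speech @ coef[:n_speech]


# Toy data (assumption): orthonormal atoms so that recovery is exact.
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((16, 16)))
D_speech, D_noise = Q[:, :10], Q[:, 10:]

clean = 2.0 * D_speech[:, 3]
noisy = clean + 1.5 * D_noise[:, 2]
estimate = enhance_spectrum(noisy, D_speech, D_noise)
print(np.allclose(estimate, clean))
```

In this toy setting the noise component is removed exactly because the atoms are orthonormal. With learned, correlated dictionaries some energy is assigned to the wrong atoms, which is the situation the dictionary evaluation and erroneous-atom masking strategies in the abstract are meant to address.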

Chinese keywords (translated): feature extraction; sparse coding; noise robustness; speech recognition

English abstract: Although current speech recognition systems can achieve high accuracy, their performance degrades severely in noisy environments, which keeps speech recognition from real-world applications. To solve this problem, we study speech feature enhancement methods based on the basic theory and techniques of sparse coding to improve the noise robustness of speech recognition systems. Sparse coding represents signals under a sparsity criterion without a stationarity assumption on the noise, which is in accordance with the way humans process signals and provides a new approach to speech feature enhancement. This research focuses on three basic aspects of sparse coding, namely dictionary construction, sparse decomposition, and reconstruction. In dictionary construction, we propose reasonable evaluation strategies and noise dictionary updating methods; in sparse decomposition, we make use of the temporal correlation of speech and noise and set the parameters of the decomposition methods dynamically; in reconstruction, we focus on exploiting prior knowledge of speech and noise and on proposing dynamic reconstruction methods to remove wrong atoms. Finally, we study feature extraction based on the enhanced speech spectrum. This research has important theoretical significance and practical value in improving the noise robu

English keywords: feature extraction; sparse coding; noise robustness; speech recognition
