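Project title: Research on Reconstruction of Missing Speech Spectra and Two-Dimensional Correlation Modeling of the Speech Spectrum
Project number: No.61271426
Project type: General Program
Year of approval: 2013
Discipline: Radio electronics and telecommunications technology
Principal investigator: Ying Dongwen (应冬文)
Institution: Institute of Acoustics, Chinese Academy of Sciences
Funding: 750,000 yuan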
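Chinese abstract: As speech technology spreads rapidly through the mobile internet, the noise robustness of speech systems has become especially important. Traditional enhancement and feature/model compensation methods have struggled to make progress, and in recent years the field has shifted its focus to reconstructing missing speech spectra. This approach exploits the correlation of the spectrum along both the time and frequency dimensions to express the missing parts as a function of the reliable parts; the key to the theory is modeling spectral correlation. Spectral correlation arises from intertwined correlations in time and frequency, yet classical modeling methods usually consider only one dimension. Although the sparse methods that have emerged recently take both dimensions into account, their high complexity limits their practical value. This research attempts to use Markov random fields and fast hierarchical sparse representation, respectively, to build two-dimensional correlation models of the speech spectrum, so that the correlations in the two dimensions become intrinsically linked. It also builds a model to estimate the strength of the correlation between missing and reliable frequency bins, as a basis for optimizing the reconstruction algorithm. The spectral correlation modeling methods apply not only to spectrum reconstruction but are also relevant to other areas such as speech enhancement and feature compensation.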
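Chinese keywords: missing spectrum reconstruction; time-frequency correlation; speech enhancement; speech separation; sound source localization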
English abstract: With the rapid development of the mobile internet, the robustness of speech processing systems has become increasingly important. Breakthrough progress in speech enhancement or feature/model compensation has proven difficult. Recently, more attention has been paid to the reconstruction of missing spectrographic data. By utilizing the correlation in the time and frequency dimensions, the unreliable spectrographic data are represented as a function of the reliable data. Because the time correlation is interwoven with the frequency correlation, one-dimensional models are incapable of describing the spectrographic correlation. Although the sparse representation method is based on the two-dimensional correlation, its computational complexity is too high for practical use. This research attempts to model the spectrographic correlation of speech using the Markov random field and the fast hierarchical sparse representation algorithms, respectively, so that the two correlations become intrinsically related. Moreover, a model is proposed to estimate the correlation coefficients between reliable and unreliable data, so that the correlation can be fully exploited. This modeling method can be further extended to other areas such as speech enhancement or feature compensation.
English keywords: speech signal processing; spectrographic correlation; speech enhancement; speech separation; sound source localization