Research on Multi-Sensor Information Fusion for Speech Separation and Virtual Sound Synthesis

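Project title: Research on Multi-Sensor Information Fusion for Speech Separation and Virtual Sound Synthesis

Project number: No.61201403

Project type: Young Scientists Fund project

Year of approval: 2013

Discipline: Electronics and Information Systems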

Project author: Chengshi Zheng (郑成诗)

Affiliation: Institute of Acoustics, Chinese Academy of Sciences

Funding: CNY 240,000

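Chinese abstract: The main problem facing conventional microphone-array-based speech enhancement is that performance drops sharply in complex noise and strongly reverberant environments, and the improvement in speech signal-to-noise ratio often comes at the cost of losing the sense of space. Some existing noise reduction algorithms based on binaural characteristics can preserve the spatial impression of speech to some extent, but they apply only to binaural hearing aids. This project studies speech separation and virtual sound synthesis based on multi-sensor information fusion, which has two advantages over conventional methods. First, combining active and passive sensors for speaker localization and voice activity detection can improve speech extraction and separation performance in complex environments. Second, synthesizing virtual sound from existing measured databases of far-field and near-field head-related transfer functions can deliver speech with a sense of space and presence. The project will investigate in depth the theory and techniques of three topics: voice activity detection with multi-sensor information fusion, speech separation with multi-sensor information fusion, and virtual sound synthesis based on head-related transfer functions. Theoretical breakthroughs from this research will greatly improve the noise reduction and dereverberation performance of speech enhancement in complex environments and provide users with a better immersive telepresence experience.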

Chinese keywords: information fusion; speech separation; virtual sound synthesis

English abstract: Conventional speech enhancement (SE) algorithms have two main drawbacks. First, their performance degrades significantly in complex noise scenarios and highly reverberant environments. Second, the signal-to-noise ratio (SNR) is improved at the cost of the spatial impression of the sound. Some algorithms have been proposed in recent years to preserve binaural cues; although they preserve some of these cues, they apply only to hearing aids. This project proposes multiple sensor information fusion (MSIF)-based speech separation and virtual sound synthesis to address the drawbacks of microphone array (MA)-based SE algorithms. Compared with MA-based SE algorithms, the proposed method has at least two advantages. First, speaker location and voice activity can be estimated more accurately by using both active and passive sensors, which improves speech separation performance. Second, the spatial sound image is restored by virtual sound synthesis (VSS), in which the separated speech signals are filtered with head-related transfer functions (HRTFs) from a measured database. The project will study the theory and methods of three aspects: MSIF-based voice activity detection, MSIF-based speech separation, and HRTF-based VSS. By this stud

English keywords: Information fusion; Speech separation; Virtual sound synthesis
