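Project title: Research on Whole-Word Recognition of Chinese Vocabulary and Speech Enhancement Methods Based on Spectrogram Information
Project number: No.61471111
Project type: General Program
Year of approval: 2015
Discipline: Radio electronics and telecommunication technology
Principal investigator: Wang Shuangwei (王双维)
Institution: Northeast Normal University (东北师范大学)
Funding: CNY 750,000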
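Abstract (translated from Chinese): Whether for speech recognition or speech enhancement, conventional speech processing techniques usually exploit the fact that a speech signal is a non-stationary random process, and process it in short-time frames of 10-30 ms as the basic unit. This segmentation, however, breaks up the integrity of the information carried by each syllable and to some extent degrades processing performance. This project proposes to use spectrogram analysis as the information platform for a systematic study of speaker-dependent whole-word recognition of Chinese speech, tone recognition of single characters, conversion between spectrograms of different speakers uttering the same semantic content, and speech enhancement methods, and to form a corresponding system of basic algorithms. The results are expected to support whole-unit recognition of Chinese characters, words, and sentences and to improve the efficiency of Chinese speech recognition. Tone recognition for single-character pronunciation provides a basis for Chinese emotion recognition and Chinese dialect recognition. Using geometric transformations to convert between the spectrograms of different speakers with the same semantic content makes speaker-independent recognition with a single semantic template achievable. With the spectrogram as the information platform, signal and noise that occupy the same frequency band in the audio sample can be separated by their positions in the image frequency domain, which can substantially improve speech enhancement.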
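Keywords (translated from Chinese): acoustic signal processing; speech recognition; speech enhancement; speech information processing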
English abstract: Speech processing technologies such as speech recognition and speech enhancement generally rely on the non-stationary random nature of speech signals and adopt short-time frames of 10-30 ms as the basic processing unit. However, this framing destroys the integrity of the Chinese syllable and inevitably affects processing performance. This project will study speaker-dependent whole-word recognition of Chinese speech, tone recognition of Chinese characters, geometric transformation between spectrograms of different speakers with the same semantics, and speech enhancement methods, together with the basic algorithm system for them, all built on the spectrogram image as the information platform. The research contributes to efficient whole-unit recognition of Chinese words, phrases, and sentences. The study of Chinese character tone recognition can also serve as a foundation for Chinese emotion recognition. Speaker-independent semantic recognition can be realized through geometric transformation between same-semantics spectrograms of different speakers. Because the spectrogram is chosen as the research data platform, it is relatively easy to separate signal from noise in the frequency domain, which helps improve speech enhancement performance.
English keywords: acoustic signal processing; speech recognition; speech enhancement; speech information processing