Pitch and formant frequencies are important features in speech processing applications. The vocal cords' output for vowels is periodic, and the reciprocal of its period is known as the pitch, or fundamental frequency; the formant frequencies are essentially the resonance frequencies of the vocal tract. These features vary among speakers and even among words, but they remain within a certain frequency range. In practice, the first three formants are sufficient for most speech processing tasks. Feature extraction and classification are the main components of any speech recognition system. In this article, two wavelet-based approaches built on the filter bank idea are proposed to extract these features. A comparison of the presented feature extraction methods on several speech signals shows that the wavelet transform achieves good accuracy relative to the cepstrum method and is insensitive to noise. In addition, several fuzzy-based classification techniques for speech processing are reviewed.
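To make the filter bank idea concrete, the following is a minimal sketch of an octave-band analysis with a Haar wavelet. It is not the method proposed in the article, whose wavelet and band layout are not specified here. The function names (`haar_step`, `band_energies`, `dominant_band`), the Haar filters, the 8 kHz sampling rate, and the synthetic 150 Hz tone are all assumptions chosen for illustration. Each stage splits the signal into a low-pass and a high-pass half and downsamples both by two, so the band with the highest energy gives a coarse estimate of where a pitch-like component lies.

```python
import math


def haar_step(x):
    """One analysis stage of a Haar filter bank.

    Returns the low-pass (approximation) and high-pass (detail) outputs,
    each downsampled by two.
    """
    s = 1.0 / math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) * s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) * s for i in range(0, len(x) - 1, 2)]
    return approx, detail


def band_energies(signal, fs, levels):
    """Return [((f_low, f_high), energy), ...] for each octave band.

    Band edges are nominal: Haar filters are short, so a fair amount of
    energy leaks into neighbouring bands.
    """
    bands = []
    approx = list(signal)
    f_high = fs / 2.0
    for _ in range(levels):
        approx, detail = haar_step(approx)
        f_low = f_high / 2.0
        bands.append(((f_low, f_high), sum(d * d for d in detail)))
        f_high = f_low
    # What remains after the last stage is the lowest band.
    bands.append(((0.0, f_high), sum(a * a for a in approx)))
    return bands


def dominant_band(signal, fs, levels):
    """Nominal frequency range of the band holding the most energy."""
    return max(band_energies(signal, fs, levels), key=lambda b: b[1])[0]


# Synthetic stand-in for a voiced sound: a pure 150 Hz tone (assumed values).
fs = 8000
tone = [math.sin(2.0 * math.pi * 150.0 * n / fs) for n in range(2048)]
print(dominant_band(tone, fs, 5))
```

For the synthetic tone, the strongest band should be the nominal 125 to 250 Hz octave. Real speech would need a wavelet with better frequency selectivity and a finer search within the selected band.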