Automatic speech recognition is a difficult pattern recognition problem because the speech input varies along several dimensions: the channel may differ, the signal may be clean or noisy, and speakers may differ in accent and gender. Domain adaptation is therefore important in speech recognition, where a model trained on a particular source domain is tested on a different target domain. In this paper, we propose a technique for unsupervised gender-based domain adaptation in speech recognition using phonetic features. Experiments on the TIMIT dataset show a considerable decrease in the phoneme error rate with the proposed approach.
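The abstract reports results as phoneme error rate (PER) but does not say how it is computed. As a minimal sketch, assuming the commonly used definition (total Levenshtein edit distance between reference and hypothesized phoneme sequences, divided by the total number of reference phonemes), PER could be computed as follows. The function names `edit_distance` and `phoneme_error_rate` and the example phoneme sequences are illustrative and are not taken from the paper.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two phoneme sequences.

    Counts the minimum number of substitutions, insertions, and
    deletions needed to turn `ref` into `hyp`.
    """
    # prev[j] holds the distance between the first i-1 reference
    # phonemes and the first j hypothesis phonemes.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from i reference phonemes to an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference phoneme
                curr[j - 1] + 1,     # insertion of a hypothesis phoneme
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def phoneme_error_rate(
    refs: Sequence[Sequence[str]], hyps: Sequence[Sequence[str]]
) -> float:
    """Corpus-level PER: total edit errors / total reference phonemes.

    This is an assumed definition, not one stated in the paper.
    """
    if len(refs) != len(hyps):
        raise ValueError("refs and hyps must contain the same number of utterances")
    total_ref = sum(len(r) for r in refs)
    if total_ref == 0:
        raise ValueError("reference must contain at least one phoneme")
    total_errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    return total_errors / total_ref


if __name__ == "__main__":
    # Illustrative sequences only: one deletion ("hh") and one substitution ("d" -> "t").
    reference = [["sh", "iy", "hh", "ae", "d"]]
    hypothesis = [["sh", "iy", "ae", "t"]]
    print(phoneme_error_rate(reference, hypothesis))
```

Under this definition, a gender-based adaptation experiment would compare the PER of a model trained on one gender and evaluated on the other, with and without adaptation.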