Accurate objective speech intelligibility prediction algorithms are of great interest for many applications, such as speech enhancement for hearing aids. Most algorithms measure signal-to-noise ratios or correlations between the acoustic features of clean reference signals and degraded signals. However, these hand-picked acoustic features are usually not explicitly correlated with recognition. Meanwhile, deep neural network (DNN) based automatic speech recognisers (ASR) are approaching human performance in some speech recognition tasks. This work leverages the hidden representations from a DNN-based ASR as features for speech intelligibility prediction in hearing-impaired listeners. Experiments on a hearing aid intelligibility database show that the proposed method can make better predictions than a widely used binaural measure based on short-time objective intelligibility (STOI).
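The idea of comparing a clean reference with a degraded signal in the space of ASR hidden representations can be illustrated with a minimal sketch. The abstract does not say how the representations are compared or which network layer is used, so everything here is an assumption: the function names `cosine` and `representation_similarity` are hypothetical, mean frame-wise cosine similarity is just one plausible choice of comparison, and random vectors stand in for the hidden activations that a real DNN-based ASR would produce.

```python
import math
import random


def cosine(u, v, eps=1e-12):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + eps)


def representation_similarity(ref_frames, deg_frames):
    """Mean frame-wise cosine similarity between two sequences of frames.

    Each argument is a list of per-frame feature vectors. In the setting
    of the abstract these would be hidden representations taken from an
    ASR model (assumption: both sequences are already time-aligned).
    """
    if not ref_frames or len(ref_frames) != len(deg_frames):
        raise ValueError("expected two non-empty sequences of equal length")
    total = sum(cosine(r, d) for r, d in zip(ref_frames, deg_frames))
    return total / len(ref_frames)


def add_noise(frames, scale, rng):
    """Return a copy of frames with Gaussian noise of the given scale added."""
    return [[x + scale * rng.gauss(0.0, 1.0) for x in frame] for frame in frames]


# Stand-in data: random vectors, NOT real ASR activations.
rng = random.Random(0)
clean = [[rng.gauss(0.0, 1.0) for _ in range(16)] for _ in range(50)]
mild = add_noise(clean, 0.1, rng)    # lightly degraded
severe = add_noise(clean, 2.0, rng)  # heavily degraded

score_clean = representation_similarity(clean, clean)
score_mild = representation_similarity(clean, mild)
score_severe = representation_similarity(clean, severe)
print(score_clean, score_mild, score_severe)
```

In this toy setup the score falls as the degradation grows, which is the behaviour an intrusive intelligibility measure needs. Mapping such a score to listener intelligibility would require fitting against listening-test data, which this sketch does not attempt.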