Improving hearing aid (HA) users' ability to understand speech in noisy environments is critical to the development of HA devices. To this end, it is important to derive a metric that can reliably predict speech intelligibility for HA users. A straightforward approach is to conduct a subjective listening test and use the results as the evaluation metric. However, large-scale listening tests are time-consuming and expensive, so several evaluation metrics have been derived as surrogates for subjective listening test results. In this study, we propose a multi-branched speech intelligibility prediction model (MBI-Net) for predicting the subjective intelligibility scores of HA users. MBI-Net consists of two branches, each of which processes the speech signal from one channel and comprises a hearing loss model, a cross-domain feature extraction module, and a speech intelligibility prediction model. The outputs of the two branches are fused through a linear layer to obtain the predicted speech intelligibility score. Experimental results confirm the effectiveness of MBI-Net, which produces higher prediction scores than the baseline system in Track 1 and Track 2 on the Clarity Prediction Challenge 2022 dataset.
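The two-branch structure described above can be illustrated with a minimal sketch. This is not the MBI-Net implementation: every function name, the broadband gain used as a hearing loss model, the two hand-picked features standing in for cross-domain features, the logistic predictor, and the choice to fuse scalar branch scores are hypothetical placeholders chosen only to show how data might flow from two channels to one fused score.

```python
import math
from typing import Dict, List, Sequence


def hearing_loss_model(signal: Sequence[float], gain: float) -> List[float]:
    # Hypothetical stand-in: a single broadband attenuation per channel.
    return [gain * x for x in signal]


def extract_features(signal: Sequence[float]) -> List[float]:
    # Hypothetical stand-in for cross-domain features:
    # mean energy and zero-crossing rate of the signal.
    n = len(signal)
    energy = sum(x * x for x in signal) / n
    crossings = sum(1 for a, b in zip(signal, signal[1:]) if (a < 0) != (b < 0))
    return [energy, crossings / max(n - 1, 1)]


def predict_intelligibility(features: Sequence[float],
                            weights: Sequence[float], bias: float) -> float:
    # Hypothetical predictor: logistic regression, output in (0, 1).
    z = sum(w * f for w, f in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def branch(signal: Sequence[float], gain: float,
           weights: Sequence[float], bias: float) -> float:
    # One branch: hearing loss model -> feature extraction -> predictor.
    impaired = hearing_loss_model(signal, gain)
    return predict_intelligibility(extract_features(impaired), weights, bias)


def mbi_net_sketch(left: Sequence[float], right: Sequence[float],
                   params: Dict) -> float:
    # Each branch processes one channel; a linear layer fuses the two outputs.
    left_score = branch(left, **params["left"])
    right_score = branch(right, **params["right"])
    w_left, w_right, b = params["fusion"]
    return w_left * left_score + w_right * right_score + b


# Illustrative parameters only; in a trained model these would be learned.
example_params = {
    "left": {"gain": 0.8, "weights": [1.5, -2.0], "bias": 0.1},
    "right": {"gain": 0.5, "weights": [1.5, -2.0], "bias": 0.1},
    "fusion": (0.5, 0.5, 0.0),
}
left_channel = [math.sin(0.1 * i) for i in range(200)]
right_channel = [math.sin(0.1 * i + 0.3) for i in range(200)]
fused_score = mbi_net_sketch(left_channel, right_channel, example_params)
```

With equal fusion weights and zero bias, the fused score is simply the average of the two branch scores; a learned linear layer could instead weight one channel more heavily than the other.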