We developed dysarthric speech intelligibility classifiers on 551,176 disordered speech samples contributed by a diverse set of 468 speakers with a range of self-reported speech disorders, each sample rated for overall intelligibility on a five-point scale. We trained three models following different deep learning approaches and evaluated them on ~94K utterances from 100 speakers. We further found the models to generalize well (without further training) to the TORGO (100% accuracy), UASpeech (0.93 correlation), and ALS-TDI PMP (0.81 AUC) datasets, as well as to a dataset of realistic unprompted speech we gathered (106 dysarthric and 76 control speakers, ~2,300 samples). To advance research in this domain, we share one of our models at https://tfhub.dev/google/euphonia_spice/classification/1.
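The released model can be loaded with the TensorFlow Hub library. The sketch below is a minimal illustration, not the paper's evaluation pipeline; the assumed input format (a batched 16 kHz mono float32 waveform) and the direct-call usage are assumptions for illustration only, so consult the model page at the URL above for the authoritative input signature and output semantics.

```python
# Minimal sketch: loading the released intelligibility classifier from TF Hub.
# Assumptions (not stated in the abstract): the model exposes a default
# callable taking a [batch, samples] float32 waveform at 16 kHz and returning
# scores over intelligibility classes. Check the TF Hub page for the real
# signature before relying on this.
import numpy as np
import tensorflow as tf
import tensorflow_hub as hub

model = hub.load("https://tfhub.dev/google/euphonia_spice/classification/1")

# Hypothetical input: one second of silence as a 16 kHz mono waveform.
waveform = np.zeros(16000, dtype=np.float32)
scores = model(tf.constant(waveform)[tf.newaxis, :])
print(scores)  # assumed: logits/probabilities over intelligibility classes
```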