Early diagnosis of Alzheimer's disease (AD) is crucial for enabling preventive care that delays further progression. This paper presents the development of a state-of-the-art Conformer-based speech recognition system built on the DementiaBank Pitt corpus for automatic AD detection. The baseline Conformer system, trained with speed perturbation and SpecAugment-based data augmentation, is significantly improved by incorporating a set of purposefully designed modeling features: neural architecture search based auto-configuration of domain-specific Conformer hyper-parameters in addition to parameter fine-tuning; fine-grained elderly speaker adaptation using learning hidden unit contributions (LHUC); and two-pass cross-system rescoring based combination with hybrid TDNN systems. An overall word error rate (WER) reduction of 13.6% absolute (34.8% relative) was obtained on the evaluation data of 48 elderly speakers. Using textual features extracted from the final systems' recognition outputs, the best published speech recognition based AD detection accuracy of 91.7% was obtained.
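The absolute and relative WER reductions quoted above jointly imply approximate baseline and final WER values, since a relative reduction is the absolute reduction divided by the baseline WER. The sketch below works through that arithmetic. The function name `implied_wer` is hypothetical, and the resulting figures are only estimates derived from the rounded numbers in the abstract, not values reported by the authors.

```python
from typing import Tuple


def implied_wer(absolute_reduction: float, relative_reduction: float) -> Tuple[float, float]:
    """Estimate (baseline_wer, final_wer) in percent.

    Assumes relative_reduction = absolute_reduction / baseline_wer,
    with absolute_reduction in percentage points and
    relative_reduction as a fraction (e.g. 0.348 for 34.8%).
    """
    baseline = absolute_reduction / relative_reduction
    final = baseline - absolute_reduction
    return baseline, final


# Figures taken from the abstract: 13.6% absolute, 34.8% relative.
baseline, final = implied_wer(13.6, 0.348)
print(f"implied baseline WER: about {baseline:.1f}%")
print(f"implied final WER:    about {final:.1f}%")
```

Because both inputs are rounded to one decimal place, the implied values carry a small uncertainty and should be read as a consistency check rather than as exact results.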