The speech of people with Parkinson's Disease (PD) has been shown to hold important clues about the presence and progression of the disease. We investigate the factors on which human experts base their judgments of the presence of disease in speech samples across five different speech tasks: phonations, sentence repetition, reading, recall, and picture description. We conduct listening tests to determine clinicians' accuracy at recognizing signs of PD from audio alone, and we compare their performance with that of a machine learning detection system based on Whisper. Across tasks, Whisper performs on par with or better than human experts when only audio is available, especially on challenging but important subgroups of the data: younger patients, mild cases, and female patients. Whisper's ability to recognize acoustic cues in difficult cases complements the multimodal and contextual strengths of human experts.