Speech pauses are an effective biomarker for dementia detection. Recent deep learning models have exploited speech pauses to achieve highly accurate dementia detection, but they have not examined the interpretability of those pauses, i.e., which pause positions and lengths affect the detection result, and how. In this paper, we study the positions and lengths of dementia-sensitive pauses using adversarial learning approaches. Specifically, we first apply an adversarial attack that perturbs the speech pauses of test samples, aiming to reduce the confidence of the detection model. We then apply adversarial training to evaluate how perturbing the pauses of training samples affects the detection model. We examine interpretability from the perspectives of model accuracy, pause context, and pause length. We find that, from the model's perspective, some pauses are more sensitive to dementia than others, e.g., speech pauses near the verb "is". Increasing the lengths of sensitive pauses or adding sensitive pauses pushes the model's inference toward Alzheimer's disease (AD), while decreasing the lengths of sensitive pauses or deleting them pushes it toward non-AD.
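The pause-perturbation attack described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not the paper's implementation: the transcript representation (a list of `(word, pause_after_seconds)` pairs), the stand-in scorer `toy_p_ad` and its hand-picked weights, the candidate pause lengths, and the greedy search in `greedy_pause_attack`. The heavier weight on "is" only mirrors the abstract's example of a sensitive context; it is not a measured value.

```python
import math


def toy_p_ad(transcript):
    """Hypothetical stand-in for a trained detector: returns P(AD).

    `transcript` is a list of (word, pause_after_in_seconds) pairs.
    The weights below are made up for illustration only.
    """
    score = -1.0
    for word, pause in transcript:
        weight = 1.5 if word == "is" else 0.3
        score += weight * pause
    return 1.0 / (1.0 + math.exp(-score))


def greedy_pause_attack(transcript, p_ad, candidates=(0.0, 0.5, 1.0, 2.0), budget=2):
    """Greedily edit up to `budget` pauses to lower confidence in the original label.

    Returns the perturbed transcript and a list of edits as
    (position, word, old_pause, new_pause).
    """
    current = list(transcript)
    original_is_ad = p_ad(current) >= 0.5

    def confidence(t):
        # Confidence in whichever label the model assigned to the unperturbed input.
        p = p_ad(t)
        return p if original_is_ad else 1.0 - p

    edits = []
    for _ in range(budget):
        best = None
        best_conf = confidence(current)
        for i, (word, pause) in enumerate(current):
            for new_pause in candidates:
                if new_pause == pause:
                    continue
                trial = current[:i] + [(word, new_pause)] + current[i + 1:]
                c = confidence(trial)
                if c < best_conf:
                    best_conf, best = c, (i, new_pause)
        if best is None:
            break  # no single pause edit lowers the confidence any further
        i, new_pause = best
        edits.append((i, current[i][0], current[i][1], new_pause))
        current[i] = (current[i][0], new_pause)
    return current, edits


# Hypothetical sample: a long pause follows "is".
sample = [("the", 0.0), ("boy", 0.2), ("is", 1.5), ("taking", 0.0), ("cookies", 0.4)]
adversarial, edits = greedy_pause_attack(sample, toy_p_ad)
for position, word, old, new in edits:
    print(f"pause after '{word}' (position {position}): {old}s -> {new}s")
print(f"P(AD) before: {toy_p_ad(sample):.3f}, after: {toy_p_ad(adversarial):.3f}")
```

The pauses that the search chooses to edit first are, under this toy scorer, the ones the model is most sensitive to, which is the sense in which an attack of this kind can serve as an interpretability probe. With a real detector, `toy_p_ad` would be replaced by the model's own probability output.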