Traditional screening practices for anxiety and depression impede effective monitoring and treatment of these conditions. However, recent advances in NLP and speech modelling allow textual, acoustic, and hand-crafted language-based features to jointly form the basis of future mental health screening and condition detection. Speech is a rich and readily available source of insight into an individual's cognitive state, and by leveraging different aspects of speech, we can develop new digital biomarkers for depression and anxiety. To this end, we propose a multi-modal system for screening depression and anxiety from self-administered speech tasks. The proposed model integrates deep-learned features from audio and text with hand-crafted features informed by clinically validated domain knowledge. We find that augmenting hand-crafted features with deep-learned features improves the overall classification F1 score over a baseline of hand-crafted features alone, from 0.58 to 0.63 for depression and from 0.54 to 0.57 for anxiety. These findings suggest that speech-based biomarkers for depression and anxiety hold significant promise for the future of digital health.
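As a rough illustration of the two ideas the abstract relies on, the sketch below combines hand-crafted and deep-learned features by simple concatenation and computes a binary F1 score. The abstract does not say how the modalities are fused or how F1 is averaged, so concatenation (early fusion), the binary F1 definition, the function names `fuse_features` and `f1_score`, and all numeric values are assumptions made for illustration only.

```python
from typing import List, Sequence


def fuse_features(handcrafted: Sequence[float],
                  audio_embedding: Sequence[float],
                  text_embedding: Sequence[float]) -> List[float]:
    """Early fusion (an assumption): concatenate the three feature groups."""
    return list(handcrafted) + list(audio_embedding) + list(text_embedding)


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Binary F1 = 2 * precision * recall / (precision + recall)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy values, invented purely for illustration (not data from the paper).
fused = fuse_features([0.2, 1.5], [0.1, -0.3, 0.7], [0.9])
print(len(fused))

y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
print(round(f1_score(y_true, y_pred), 3))
```

In a real pipeline the fused vector would feed a classifier, and F1 would be computed on held-out predictions for each condition separately.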