Depression detection using vocal biomarkers is a heavily researched area. Articulatory coordination features (ACFs) are developed based on changes in neuromotor coordination caused by psychomotor slowing, a key symptom of Major Depressive Disorder. However, the findings of existing studies are mostly validated on a single database, which limits the generalizability of the results. Variability across different depression databases adversely affects performance in cross-corpus evaluations (CCEs). We propose a generalized classifier for depression detection based on a dilated Convolutional Neural Network trained on ACFs extracted from two depression databases. We show that ACFs derived from Vocal Tract Variables (TVs) are a promising, robust feature set for depression detection. Our model achieves relative accuracy improvements of approximately 10% over CCEs performed on models trained on a single database. We extend the study to show that fusing TVs with Mel-Frequency Cepstral Coefficients (MFCCs) further improves the classifier's performance.
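To make the dilated-CNN classifier concrete, the following is a minimal sketch, assuming a PyTorch implementation that operates on ACF feature matrices with a time axis. The channel widths, dilation rates, number of layers, and the assumed 36 input feature channels are illustrative choices, not the architecture or hyperparameters reported in this work.

```python
# Illustrative sketch only: a 1-D dilated CNN binary classifier over
# articulatory coordination feature (ACF) sequences. Layer counts,
# channel widths, and dilation rates are assumptions for illustration.
import torch
import torch.nn as nn

class DilatedCNNClassifier(nn.Module):
    def __init__(self, n_features: int = 36, n_classes: int = 2):
        super().__init__()
        # Stacked 1-D convolutions with increasing dilation enlarge the
        # receptive field along the time axis without pooling.
        self.conv = nn.Sequential(
            nn.Conv1d(n_features, 64, kernel_size=3, dilation=1, padding=1),
            nn.ReLU(),
            nn.Conv1d(64, 64, kernel_size=3, dilation=2, padding=2),
            nn.ReLU(),
            nn.Conv1d(64, 64, kernel_size=3, dilation=4, padding=4),
            nn.ReLU(),
        )
        self.pool = nn.AdaptiveAvgPool1d(1)  # collapse variable-length time axis
        self.fc = nn.Linear(64, n_classes)   # depressed vs. non-depressed

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, n_features, time) -- ACFs stacked per frame/segment
        h = self.conv(x)
        h = self.pool(h).squeeze(-1)
        return self.fc(h)

# Example: a batch of 8 utterances, 36 ACF channels, 500 time steps.
logits = DilatedCNNClassifier()(torch.randn(8, 36, 500))
print(logits.shape)  # torch.Size([8, 2])
```

In the same spirit, TV-MFCC fusion could be sketched by concatenating the two feature streams along the channel axis before the first convolution; the actual fusion strategy used here is described in the body of the paper.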