We describe the development of a model to detect user-level clinical depression based on a user's temporal social media posts. Our model uses a Depression Symptoms Detection (DSD) classifier, which is trained on the largest existing sample of clinician-annotated tweets for clinical depression symptoms. We then use the DSD model to extract clinically relevant features, e.g., depression scores and their consequent temporal patterns, as well as user posting activity patterns, e.g., quantifying periods of "no activity" or "silence." To evaluate the efficacy of these extracted features, we create three kinds of datasets, including a test dataset, from two existing well-known benchmark datasets for user-level depression detection. We then report accuracy measures based on single features, baseline features, and feature ablation tests at several levels of temporal granularity. The relevant data distributions and clinical depression detection settings can be used to draw a complete picture of the impact of different features across our created datasets. Finally, we show that, in general, only semantic-oriented representation models perform well. However, clinical features may enhance overall performance provided that the training and testing distributions are similar and there is more data in a user's timeline. As a consequence, the predictive capability of depression scores increases significantly when they are used in a more sensitive clinical depression detection setting.