How does language differ between a person's Facebook status updates and their text messages (SMS)? In this study, we show how Facebook and SMS language differ in psycholinguistic characteristics and how these differences affect downstream analyses, using depression diagnosis as an illustration. We use a sample of consenting participants who shared their Facebook status updates and SMS data and answered a standard psychological depression screener. We quantify domain differences using psychologically driven lexical methods and find that language on Facebook involves more personal concerns, experiences, and content features, while language in SMS contains more informal and style features. Next, we estimate depression from both text domains using a depression model trained on Facebook data, and find a drop in accuracy when predicting self-reported depression assessments from the SMS-based depression estimates. Finally, we evaluate a simple domain adaptation correction based on the words driving the cross-platform differences and apply it to the SMS-derived depression estimates, resulting in a significant improvement in prediction. Our work demonstrates the difference in language use between Facebook and SMS and suggests the necessity of cross-domain adaptation for text-based predictions.