Compared to physical health, population mental health measurement in the U.S. is very coarse-grained. Currently, in the largest population surveys, such as those carried out by the Centers for Disease Control and Prevention or Gallup, mental health is captured only broadly through "mentally unhealthy days" or "sadness," and is limited to relatively infrequent state or metropolitan estimates. Through the large-scale analysis of social media data, robust estimation of population mental health is feasible at much higher resolutions, down to weekly estimates for counties. In the present work, we validate a pipeline that uses a sample of 1.2 billion Tweets from 2 million geo-located users to estimate mental health changes for the two leading mental health conditions, depression and anxiety. We find moderate to large associations between the language-based mental health assessments and survey scores from Gallup for multiple levels of granularity, down to the county-week (fixed effects $\beta = .25$ to $1.58$; $p<.001$). Language-based assessment allows for the cost-effective and scalable monitoring of population mental health at weekly time scales. Such spatially fine-grained time series are well suited to monitoring the effects of societal events and policies and to enabling quasi-experimental study designs in population health and other disciplines. Beyond mental health in the U.S., this method generalizes to a broad set of psychological outcomes and allows for community measurement in under-resourced settings where social media data are available but traditional survey measures are not.
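To make the county-week comparison concrete, the following is a minimal Python sketch and not the pipeline validated in this work. It assumes each post already carries a language-based score (for example, for depression), averages scores per user and then across users within a county-week, and estimates a within-county slope of survey scores on language scores. Treating the fixed effects as county-level is an assumption; the abstract does not specify the model. The function names `county_week_means` and `within_county_slope`, the `min_users` threshold, and the record layout are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def county_week_means(records, min_users=1):
    """Aggregate (county, week, user_id, score) records to county-week means.

    Scores are averaged per user first, so highly active users do not
    dominate a cell. Cells with fewer than `min_users` users are dropped
    (the threshold is an illustrative assumption).
    """
    per_user = defaultdict(list)
    for county, week, user, score in records:
        per_user[(county, week, user)].append(score)

    per_cell = defaultdict(list)
    for (county, week, _user), scores in per_user.items():
        per_cell[(county, week)].append(mean(scores))

    return {cell: mean(vals) for cell, vals in per_cell.items()
            if len(vals) >= min_users}


def within_county_slope(language, survey):
    """Slope of survey on language scores with county fixed effects.

    Both inputs map (county, week) to a score. Each series is demeaned
    within county, then a single least-squares slope is fitted to the
    demeaned values (the standard "within" estimator).
    """
    by_county = defaultdict(list)
    for county, week in set(language) & set(survey):
        by_county[county].append((language[(county, week)],
                                  survey[(county, week)]))

    num = den = 0.0
    for pairs in by_county.values():
        x_bar = mean(x for x, _ in pairs)
        y_bar = mean(y for _, y in pairs)
        for x, y in pairs:
            num += (x - x_bar) * (y - y_bar)
            den += (x - x_bar) ** 2

    if den == 0:
        raise ValueError("no within-county variation in language scores")
    return num / den
```

Demeaning within county removes stable between-county differences, so the slope reflects only whether week-to-week changes in language scores track week-to-week changes in survey scores. A real analysis would also need reweighting of the user sample, standard errors, and a minimum number of users per cell.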