Social media generate data on human behaviour at large scales and over long periods of time, offering a complementary approach to traditional methods in the social sciences. Millions of texts from social media can be processed with computational methods to study emotions over time and across regions. However, recent research has shown weak correlations between social media emotions and affect questionnaires at the individual level, and between static regional aggregates of social media emotion and subjective well-being at the population level, calling into question the validity of social media data for studying emotions. Yet, to date, no research has tested the validity of social media emotion macroscopes for tracking the temporal evolution of emotions at the level of a whole society. Here we present a pre-registered prediction study showing that gender-rescaled time series of Twitter emotional expression at the national level correlate substantially with aggregates of self-reported emotions in a weekly representative survey in the United Kingdom. A follow-up exploratory analysis shows a high prevalence of third-person references in emotionally charged tweets, indicating that social media data provide a way of socially sensing the emotions of others rather than just the emotional experiences of users. These results show that, despite the issues that social media have in terms of representativeness and algorithmic confounding, the combination of advanced text analysis methods with user demographic information in social media emotion macroscopes can provide measures that are informative of the general population beyond social media users.