From health to education, income shapes a wide range of life choices. Many papers have leveraged data from online social networks to study this effect. In this paper, we ask the opposite question: do different levels of income result in different online behaviors? We demonstrate that they do. We present the first large-scale study of Nextdoor, a popular location-based social network. We collect 2.6 million posts from 64,283 neighborhoods in the United States and 3,325 neighborhoods in the United Kingdom to examine whether online discourse reflects the income and income inequality of a neighborhood. We show that posts from neighborhoods with different incomes do indeed differ: for example, richer neighborhoods express more positive sentiment and discuss crime more, even though their actual crime rates are much lower. We then show that user-generated content can predict both income and inequality: we train multiple machine learning models and predict both income (R-squared = 0.841) and inequality (R-squared = 0.77).