The spread of COVID-19 has sparked racism and hate on social media targeted towards Asian communities. However, little is known about how racial hate spreads during a pandemic or about the role of counterspeech in mitigating this spread. In this work, we study the evolution and spread of anti-Asian hate speech through the lens of Twitter. We create COVID-HATE, the largest dataset of anti-Asian hate and counterspeech, spanning 14 months and containing over 206 million tweets and a social network with over 127 million nodes. Using a novel hand-labeled dataset of 3,355 tweets, we train a text classifier that identifies hate and counterspeech tweets with an average macro-F1 score of 0.832. Using the COVID-HATE dataset, we conduct a longitudinal analysis of tweets and users. Analysis of the social network reveals that hateful and counterspeech users interact and engage extensively with one another, instead of living in isolated polarized communities. We find that nodes were highly likely to become hateful after being exposed to hateful content. Notably, counterspeech messages may discourage users from turning hateful, potentially suggesting a solution to curb hate on web and social media platforms. Data and code are available at http://claws.cc.gatech.edu/covid.