The outbreak of the COVID-19 pandemic has changed our lives in unprecedented ways. In the face of the projected catastrophic consequences, many countries have enacted social distancing measures in an attempt to limit the spread of the virus. Under these conditions, the Web has become an indispensable medium for information acquisition, communication, and entertainment. At the same time, unfortunately, the Web is being exploited to disseminate potentially harmful and disturbing content, such as conspiracy theories and hateful speech towards specific ethnic groups, in particular towards Chinese people, since COVID-19 is believed to have originated in China. In this paper, we make a first attempt to study the emergence of Sinophobic behavior on the Web during the outbreak of the COVID-19 pandemic. We collect two large-scale datasets from Twitter and 4chan's Politically Incorrect board (/pol/) over a period of approximately five months and analyze them to investigate whether there is a rise in, or important differences with regard to, the dissemination of Sinophobic content. We find that COVID-19 indeed drives the rise of Sinophobia on the Web and that the dissemination of Sinophobic content is a cross-platform phenomenon: it exists on fringe Web communities like /pol/, and to a lesser extent on mainstream ones like Twitter. Using word embeddings over time, we also characterize the evolution and emergence of new Sinophobic slurs on both Twitter and /pol/. Finally, we find interesting differences in the context in which words related to Chinese people are used on the Web before and after the COVID-19 outbreak: on Twitter we observe a shift towards blaming China for the situation, while on /pol/ we find a shift towards using more (and new) Sinophobic slurs.