The COVID-19 epidemic is widely regarded as a global health crisis and the greatest challenge humanity has faced since World War II. Unfortunately, fake news about COVID-19 is spreading as fast as the virus itself. Incorrect health measures, anxiety, and hate speech can have harmful consequences for people's physical and mental health worldwide. To help combat COVID-19 fake news, we propose a new fake news detection dataset, MM-COVID (Multilingual and Multidimensional COVID-19 Fake News Data Repository). The dataset provides multilingual fake news together with the relevant social context. We collect 3981 pieces of fake news content and 7192 pieces of trustworthy information in six languages: English, Spanish, Portuguese, Hindi, French, and Italian. We present a detailed exploratory analysis of MM-COVID from different perspectives and demonstrate its utility in several potential applications of COVID-19 fake news research on multilingual data and social media.