The COVID-19 pandemic has generated what public health officials have called an infodemic of misinformation. As social distancing and stay-at-home orders came into effect, many people turned to social media to socialize. This increase in usage has made social media a prime vehicle for spreading misinformation. This paper presents a mechanism for detecting COVID-19 health-related misinformation on social media, following an interdisciplinary approach. Building on social psychology and existing misinformation frameworks, we defined misinformation themes and associated keywords, and incorporated them into the detection mechanism using applied machine learning techniques. Next, using a Twitter dataset, we evaluated the proposed methodology with multiple state-of-the-art machine learning classifiers. Our method shows promising results, reaching up to 78% accuracy in classifying health-related misinformation versus true information when using unigram-based NLP features generated from tweets and the Decision Tree classifier. We also suggest alternatives for countering misinformation and discuss ethical considerations for the study.
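To make the best-performing configuration concrete, the following is a minimal sketch of unigram features feeding a decision tree. It is not the authors' implementation: the tweets are invented toy examples, the names `unigrams`, `build_tree`, and `predict` are hypothetical, and the tree is a small pure-Python stand-in (Gini impurity, splits on whether a unigram is present) rather than a library classifier.

```python
import re
from collections import Counter

# Invented toy tweets for illustration only; not taken from the paper's dataset.
TWEETS = [
    ("drinking bleach cures covid", "misinformation"),
    ("5g towers spread the virus", "misinformation"),
    ("garlic cures covid overnight", "misinformation"),
    ("wash your hands to reduce spread of covid", "true"),
    ("vaccines reduce severe covid illness", "true"),
    ("masks reduce transmission of the virus", "true"),
]

def unigrams(text):
    """Lowercase a tweet and return its set of word unigrams."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def build_tree(rows, labels, depth=0, max_depth=3):
    """Grow a binary tree whose internal nodes test for one unigram."""
    majority = Counter(labels).most_common(1)[0][0]
    if depth == max_depth or len(set(labels)) == 1:
        return majority  # leaf: predict the majority class
    best = None
    for token in sorted(set().union(*rows)):
        present = [y for x, y in zip(rows, labels) if token in x]
        absent = [y for x, y in zip(rows, labels) if token not in x]
        if not present or not absent:
            continue  # this token does not split the data
        # Weighted Gini impurity of the two child nodes.
        score = (len(present) * gini(present)
                 + len(absent) * gini(absent)) / len(labels)
        if best is None or score < best[0]:
            best = (score, token)
    if best is None:
        return majority
    token = best[1]
    yes = [(x, y) for x, y in zip(rows, labels) if token in x]
    no = [(x, y) for x, y in zip(rows, labels) if token not in x]
    return {
        "token": token,
        "yes": build_tree([x for x, _ in yes], [y for _, y in yes],
                          depth + 1, max_depth),
        "no": build_tree([x for x, _ in no], [y for _, y in no],
                         depth + 1, max_depth),
    }

def predict(tree, text):
    """Walk the tree using the unigrams of a new tweet."""
    tokens = unigrams(text)
    while isinstance(tree, dict):
        tree = tree["yes"] if tree["token"] in tokens else tree["no"]
    return tree

rows = [unigrams(text) for text, _ in TWEETS]
labels = [label for _, label in TWEETS]
tree = build_tree(rows, labels)
```

Calling `predict(tree, some_tweet)` returns one of the two labels. On six toy tweets the tree fits the training data trivially, so the sketch says nothing about the reported accuracy; it only shows how unigram presence can drive the splits of a decision tree.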