As the COVID-19 pandemic sweeps across the world, it has been accompanied by a tsunami of fake news and misinformation on social media. At a time when reliable information is vital for public health and safety, COVID-19-related fake news has been spreading even faster than the facts. During a crisis such as the COVID-19 pandemic, fake news can not only cause intellectual confusion but also put people's lives at risk. This calls for immediate action to contain the spread of such misinformation on social media. We introduce CTF, the first COVID-19 Twitter fake news dataset with labelled genuine and fake tweets. Additionally, we propose Cross-SEAN, a cross-stitch-based semi-supervised end-to-end neural attention model that leverages the large amount of unlabelled data. Cross-SEAN partially generalises to emerging fake news as it learns from relevant external knowledge. We compare Cross-SEAN with seven state-of-the-art fake news detection methods. We observe that it achieves an F1 score of $0.95$ on CTF, outperforming the best baseline by $9\%$. We also develop Chrome-SEAN, a Cross-SEAN-based Chrome extension for real-time detection of fake tweets.