As the spread of false information on the internet has increased dramatically in recent years, automated fake news detection is receiving growing attention. Some detection methods are already quite successful; nevertheless, the underlying algorithms still have many vulnerabilities, because fake news publishers can structure and phrase their texts so that a detection algorithm does not expose them as fake news. This paper shows that state-of-the-art models trained to detect fake news can be attacked automatically, rendering them vulnerable. For this purpose, such models were first trained on a dataset. Then, using TextAttack, the trained models were manipulated so that previously correctly identified fake news was classified as true news. The results show that fake news detection mechanisms can be bypassed automatically, which has implications for existing policy initiatives.
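The general idea of such an attack can be illustrated with a toy sketch: a trivial keyword-based "detector" and a greedy word-substitution attack that flips its verdict. This is purely illustrative and is not the paper's method; the trigger words and synonym table below are invented, and the actual experiments use trained models and the TextAttack library.

```python
# Toy sketch of an adversarial word-substitution attack (not the paper's
# actual setup; all word lists here are invented for illustration).

SUSPICIOUS = {"shocking", "miracle", "exposed"}  # toy trigger words
SYNONYMS = {"shocking": "surprising", "miracle": "remarkable", "exposed": "revealed"}

def detect_fake(text: str) -> bool:
    """Flag the text as fake news if it contains any trigger word."""
    return any(w.strip(".,!?") in SUSPICIOUS for w in text.lower().split())

def attack(text: str) -> str:
    """Greedily swap trigger words for neutral synonyms while the
    detector still flags the text, preserving the overall meaning."""
    words = text.split()
    for i, w in enumerate(words):
        key = w.lower().strip(".,!?")
        if key in SYNONYMS and detect_fake(" ".join(words)):
            words[i] = SYNONYMS[key]
    return " ".join(words)

headline = "Shocking miracle cure exposed by doctors!"
assert detect_fake(headline)          # originally classified as fake
assert not detect_fake(attack(headline))  # perturbed version evades detection
```

Real attacks such as those built with TextAttack follow the same principle, but query a trained neural classifier and choose semantics-preserving perturbations automatically.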