With the proliferation of online misinformation, fake news detection has gained importance in the artificial intelligence community. In this paper, we propose an adversarial benchmark that tests the ability of fake news detectors to reason about real-world facts. We formulate adversarial attacks that target three aspects of "understanding": compositional semantics, lexical relations, and sensitivity to modifiers. We test our benchmark on BERT classifiers fine-tuned on the LIAR (arXiv:1705.00648) and Kaggle Fake-News datasets, and show that both models fail to respond to changes in compositional and lexical meaning. Our results underscore the need for such models to be used in conjunction with other fact-checking methods.
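To make the probing procedure concrete, the following is a minimal sketch, not the authors' released code, of the kind of adversarial test described above: perturb a claim's compositional meaning (here, by negation) and check whether a fine-tuned BERT classifier changes its prediction. It assumes a Hugging Face `transformers` environment; the checkpoint path is a placeholder for a BERT model fine-tuned on LIAR or the Kaggle Fake-News data, and the example claims are illustrative, not drawn from the benchmark.

```python
from transformers import pipeline

# Hypothetical checkpoint: any BERT-based sequence classifier
# fine-tuned for fake-news detection (e.g., on LIAR or Kaggle Fake-News).
clf = pipeline("text-classification", model="path/to/bert-finetuned-liar")

original = "The unemployment rate fell to 4 percent last year."
# Negation flips the claim's truth value; a detector that models
# compositional semantics should change its output in response.
perturbed = "The unemployment rate did not fall to 4 percent last year."

for claim in (original, perturbed):
    pred = clf(claim)[0]
    print(f"{pred['label']:>10} ({pred['score']:.3f})  {claim}")

# If both claims receive the same label with similar confidence, the
# model is insensitive to the change in compositional meaning -- the
# failure mode the benchmark is designed to expose.
```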