Automated fact-checking is needed to curtail the spread of online misinformation. One current framework for such solutions proposes to verify claims by retrieving supporting or refuting evidence from related textual sources. However, realistic use cases for fact-checkers will require verifying claims against evidence sources that could themselves be affected by the same misinformation. Furthermore, modern NLP tools that can produce coherent, fabricated content allow malicious actors to systematically generate adversarial disinformation aimed at fact-checkers. In this work, we explore the sensitivity of automated fact-checkers to synthetic adversarial evidence in two simulated settings: AdversarialAddition, where we fabricate documents and add them to the evidence repository available to the fact-checking system, and AdversarialModification, where existing evidence source documents in the repository are automatically altered. Our study across multiple models on three benchmarks demonstrates that these systems suffer significant performance drops under these attacks. Finally, we discuss the growing threat of modern NLG systems as generators of disinformation in the context of the challenges they pose to automated fact-checkers.