Misinformation emerges in times of uncertainty when credible information is limited. This is challenging for NLP-based fact-checking because it relies on counter-evidence, which may not yet be available. Despite increasing interest in automatic fact-checking, it is still unclear whether automated approaches can realistically refute harmful real-world misinformation. Here, we compare and contrast NLP fact-checking with how professional fact-checkers combat misinformation in the absence of counter-evidence. In our analysis, we show that, by design, existing NLP task definitions for fact-checking cannot refute misinformation for the majority of claims in the way professional fact-checkers do. We then define two requirements that the evidence in datasets must fulfill for realistic fact-checking: it must be (1) sufficient to refute the claim and (2) not leaked from existing fact-checking articles. We survey existing fact-checking datasets and find that all of them fail to satisfy both criteria. Finally, we perform experiments to demonstrate that models trained on a large-scale fact-checking dataset rely on leaked evidence, which makes them unsuitable in real-world scenarios. Taken together, we show that current NLP fact-checking cannot realistically combat real-world misinformation because it depends on unrealistic assumptions about counter-evidence in the data.
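To make requirement (2) concrete, the following minimal Python sketch filters retrieved evidence so that none of it originates from known fact-checking outlets, which would otherwise leak the verdict. The domain list, function names, and data format are illustrative assumptions for exposition, not an implementation from the paper; a real pipeline would use a curated registry of fact-checking sites and would still need a separate judgment of sufficiency (requirement 1).

```python
# Hypothetical sketch of requirement (2): discard evidence that is
# "leaked" from existing fact-checking articles. The domain list
# below is an illustrative assumption, not from the paper.
from urllib.parse import urlparse

# Assumed, deliberately incomplete set of fact-checking outlets.
FACT_CHECKING_DOMAINS = {
    "snopes.com",
    "politifact.com",
    "factcheck.org",
    "fullfact.org",
}

def is_leaked(evidence_url: str) -> bool:
    """Return True if the evidence comes from a fact-checking site,
    i.e., it would reveal the verdict rather than independently
    support or refute the claim."""
    host = urlparse(evidence_url).netloc.lower()
    # Match the registered domain and any of its subdomains (e.g. www.).
    return any(host == d or host.endswith("." + d)
               for d in FACT_CHECKING_DOMAINS)

def filter_evidence(evidence_urls: list[str]) -> list[str]:
    """Keep only evidence that satisfies requirement (2); whether the
    remainder is sufficient to refute the claim (requirement 1) must
    still be assessed separately."""
    return [u for u in evidence_urls if not is_leaked(u)]

if __name__ == "__main__":
    urls = [
        "https://www.snopes.com/fact-check/some-claim/",
        "https://www.who.int/news/item/statement-on-x",
    ]
    print(filter_evidence(urls))  # only the non-fact-checking URL survives
```

Note that a domain-level filter like this is only a first approximation: leaked evidence can also reach a dataset indirectly, for example through news articles that quote a fact-checking verdict, which is part of why the surveyed datasets fail the criterion.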