Misinformation threatens modern society by promoting distrust in science, changing narratives in public health, heightening social polarization, and disrupting democratic elections and financial markets, among many other societal harms. To address this, a growing cadre of professional fact-checkers and journalists provides high-quality investigations into purported facts. However, these largely manual efforts have struggled to match the enormous scale of the problem. In response, a growing body of Natural Language Processing (NLP) technologies has been proposed for more scalable fact-checking. Despite tremendous growth in such research, practical adoption of NLP technologies for fact-checking remains in its infancy. In this work, we review the capabilities and limitations of current NLP technologies for fact-checking. Our particular focus is to chart the design space for how these technologies can be harnessed and refined to better meet the needs of human fact-checkers. To do so, we review key aspects of NLP-based fact-checking: task formulation, dataset construction, modeling, and human-centered strategies such as explainable models and human-in-the-loop approaches. Next, we review the efficacy of applying NLP-based fact-checking tools to assist human fact-checkers. We recommend that future research involve fact-checker stakeholders early in the NLP research process and incorporate human-centered design practices into model development, so that technology development is guided toward human use and practical adoption. Finally, we advocate for more research on benchmark development that supports extrinsic evaluation of human-centered fact-checking technologies.