Exponential growth in digital information outlets and the race to publish have made scientific misinformation more prevalent than ever. However, fact-verifying a given scientific claim is not straightforward even for researchers. Scientific claim verification requires in-depth knowledge and considerable labor from domain experts to substantiate supporting or refuting evidence from credible scientific sources. The SciFact dataset and its associated task provide the community with a benchmark leaderboard for developing automatic scientific claim verification systems that extract and assimilate relevant evidence rationales from source abstracts. In this work, we propose a modular approach that sequentially carries out binary classification for each prediction subtask on the SciFact leaderboard. Our simple classifier-based approach uses reduced abstract representations to retrieve relevant abstracts, which are then used to train the rationale-selection model. Finally, we carry out two-step stance prediction that first filters out non-relevant rationales and then identifies whether a rationale supports or refutes a given claim. Experimentally, our system RerrFact, with no fine-tuning, a simple design, and a fraction of the model parameters, fares competitively on the leaderboard against large-scale, modular, and joint modeling approaches. We make our codebase available at https://github.com/ashishrana160796/RerrFact.
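To make the sequential pipeline concrete, the sketch below outlines the flow of a claim through the three binary-classification stages: abstract retrieval, rationale selection, and two-step stance prediction. The classifier objects (`retriever`, `selector`, `relevance_clf`, `stance_clf`) and their method names are hypothetical placeholders for illustration, not the released RerrFact API; see the repository linked above for the actual implementation.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Verdict:
    """Pipeline output for one abstract, using the SciFact label set."""
    abstract_id: int
    rationales: List[str]
    label: str  # "SUPPORT", "CONTRADICT", or "NOT_ENOUGH_INFO"


def verify_claim(claim, corpus, retriever, selector, relevance_clf, stance_clf):
    """Sequential binary-classification pipeline: retrieve abstracts,
    select rationale sentences, then predict stance in two steps."""
    verdicts = []
    # Step 1: binary abstract retrieval over reduced abstract representations.
    for abstract in retriever.retrieve(claim, corpus):
        # Step 2: binary rationale selection within the retrieved abstract.
        rationales = [s for s in abstract.sentences
                      if selector.is_rationale(claim, s)]
        # Step 3a: the first stance step filters out non-relevant rationales.
        if not rationales or not relevance_clf.is_relevant(claim, rationales):
            verdicts.append(Verdict(abstract.id, [], "NOT_ENOUGH_INFO"))
            continue
        # Step 3b: the second stance step decides SUPPORT vs. CONTRADICT.
        label = ("SUPPORT" if stance_clf.supports(claim, rationales)
                 else "CONTRADICT")
        verdicts.append(Verdict(abstract.id, rationales, label))
    return verdicts
```

Because each stage is an independent binary classifier, any one of them can be retrained or swapped out without touching the rest of the pipeline, which is the modularity the abstract refers to.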