With the explosive growth of scientific publications, synthesizing scientific knowledge and checking facts has become an increasingly complex task. In this paper, we propose a multi-task approach for verifying scientific questions through joint reasoning over facts and evidence in research articles. We propose a combination of (1) automatic information summarization and (2) Boolean Question Answering, which generates an answer to a scientific question using only the extracts obtained from summarization. Thus, for a given topic, our approach performs structured content modeling on paper abstracts to answer a scientific question while highlighting the passages in the papers that discuss the topic. Our final system is based on an end-to-end Extractive Question Answering (EQA) model combined with a three-output classification model, which performs in-depth semantic understanding of a question and illustrates the aggregation of multiple responses. With our light and fast architecture, we achieved an average error rate of 4% and an F1-score of 95.6%. Our results are supported by experiments with two QA models (BERT, RoBERTa) over 3 million Open Access (OA) articles in the medical and health domains on Europe PMC.
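The aggregation of multiple responses mentioned above can be illustrated with a minimal sketch. It assumes that each abstract yields one (label, confidence) pair from the three-output classifier and that the pairs are combined by a confidence-weighted vote. The label names `yes`, `no`, and `neutral`, the function `aggregate_answers`, and the voting rule itself are all hypothetical choices made for illustration; the abstract does not specify them.

```python
from typing import Dict, List, Tuple

# Hypothetical label set: the abstract only states that the classifier
# has three outputs, not what they are called.
LABELS = ("yes", "no", "neutral")


def aggregate_answers(
    predictions: List[Tuple[str, float]]
) -> Tuple[str, Dict[str, float]]:
    """Combine per-abstract (label, confidence) pairs into one answer.

    A confidence-weighted vote is used here purely as an illustration;
    it is not necessarily the aggregation rule used in the paper.
    """
    if not predictions:
        raise ValueError("need at least one prediction")

    # Sum the confidence mass assigned to each label.
    totals = {label: 0.0 for label in LABELS}
    for label, confidence in predictions:
        if label not in totals:
            raise ValueError(f"unknown label: {label!r}")
        totals[label] += confidence

    # Normalise to shares so the result is comparable across questions.
    weight_sum = sum(totals.values())
    if weight_sum > 0:
        shares = {label: totals[label] / weight_sum for label in LABELS}
    else:
        shares = {label: 0.0 for label in LABELS}

    # Ties are resolved by the order of LABELS (max keeps the first maximum).
    winner = max(LABELS, key=lambda label: totals[label])
    return winner, shares


# Example: four abstracts answering the same scientific question.
example = [("yes", 0.9), ("yes", 0.8), ("no", 0.6), ("neutral", 0.7)]
answer, shares = aggregate_answers(example)
```

Returning the per-label shares alongside the winning label keeps the disagreement between abstracts visible, rather than hiding it behind a single answer.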