The AI2 Reasoning Challenge (ARC), a new benchmark dataset for question answering (QA), has recently been released. ARC contains only natural science questions authored for human exams, which are hard to answer and require advanced logical reasoning. On the ARC Challenge Set, existing state-of-the-art QA systems fail to significantly outperform a random baseline, reflecting the difficulty of this task. In this paper, we propose a novel framework for answering science exam questions that mimics how humans solve problems in an open-book exam. To address the reasoning challenge, we construct contextual knowledge graphs for the question itself and for the supporting sentences, respectively. Our model learns to reason with neural embeddings of both knowledge graphs. Experiments on the ARC Challenge Set show that our model outperforms previous state-of-the-art QA systems.
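To make the graph-matching idea concrete, the following minimal sketch embeds a toy question-side knowledge graph and a toy supporting-sentence graph, then scores their agreement. The graphs, random concept vectors, mean-pooling readout, and cosine-similarity scorer are all illustrative assumptions for exposition, not the paper's actual architecture.

```python
# Hypothetical sketch: score an answer by comparing the embedding of a
# question-side knowledge graph against the embedding of a graph built from
# retrieved supporting sentences. Everything here is a simplifying assumption.
import numpy as np

rng = np.random.default_rng(0)

def embed_graph(triples, node_vecs):
    """Mean-pool the node embeddings over all (head, relation, tail) triples."""
    nodes = {n for h, _, t in triples for n in (h, t)}
    return np.mean([node_vecs[n] for n in nodes], axis=0)

# Toy concept embeddings (in practice these would come from a trained model).
dim = 8
vocab = ["friction", "heat", "energy", "motion", "light"]
node_vecs = {w: rng.normal(size=dim) for w in vocab}

# Contextual KG built from the question plus one candidate answer.
question_graph = [("friction", "causes", "heat")]
# Contextual KG built from retrieved supporting sentences.
support_graph = [("friction", "produces", "heat"), ("heat", "is-a", "energy")]

q_emb = embed_graph(question_graph, node_vecs)
s_emb = embed_graph(support_graph, node_vecs)

# Cosine similarity stands in for the learned reasoning/matching module;
# the answer option whose graph scores highest would be selected.
score = q_emb @ s_emb / (np.linalg.norm(q_emb) * np.linalg.norm(s_emb))
print(f"match score: {score:.3f}")
```

In this toy setup, the candidate answer whose question-side graph embedding best matches the supporting-sentence graph embedding would be chosen as the prediction.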