Our goal, in the context of open-domain textual question-answering (QA), is to explain answers by not just listing supporting textual evidence ("rationales"), but also showing how such evidence leads to the answer in a systematic way. If this could be done, new opportunities for understanding and debugging the system's reasoning would become possible. Our approach is to generate explanations in the form of entailment trees, namely a tree of entailment steps from facts that are known, through intermediate conclusions, to the final answer. To train a model with this skill, we created ENTAILMENTBANK, the first dataset to contain multistep entailment trees. At each node in the tree (typically) two or more facts compose together to produce a new conclusion. Given a hypothesis (question + answer), we define three increasingly difficult explanation tasks: generate a valid entailment tree given (a) all relevant sentences (the leaves of the gold entailment tree), (b) all relevant and some irrelevant sentences, or (c) a corpus. We show that a strong language model only partially solves these tasks, and identify several new directions to improve performance. This work is significant as it provides a new type of dataset (multistep entailments) and baselines, offering a new avenue for the community to generate richer, more systematic explanations.
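The entailment-tree structure described above can be sketched as a small recursive data type. This is a minimal illustration, not the paper's actual code: `EntailmentNode`, its fields, and the example sentences (loosely inspired by the kind of science facts in ENTAILMENTBANK) are all hypothetical names chosen here; leaves hold supporting facts, internal nodes hold conclusions entailed by their children, and the root holds the hypothesis (question + answer).

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class EntailmentNode:
    """A node in an entailment tree: a fact (leaf) or an entailed conclusion."""
    sentence: str
    children: List["EntailmentNode"] = field(default_factory=list)

    def is_leaf(self) -> bool:
        return not self.children

    def leaves(self) -> List[str]:
        """Collect the supporting facts (leaf sentences), left to right.
        In task (a), exactly these sentences are given as input."""
        if self.is_leaf():
            return [self.sentence]
        return [s for c in self.children for s in c.leaves()]

# Two facts entail an intermediate conclusion, which together with a
# third fact entails the hypothesis at the root.
intermediate = EntailmentNode(
    "eruptions block sunlight",
    [EntailmentNode("eruptions emit ash"),
     EntailmentNode("ash blocks sunlight")],
)
tree = EntailmentNode(
    "eruptions can cause plants to die",
    [intermediate, EntailmentNode("plants need sunlight to survive")],
)
print(tree.leaves())
```

Under this sketch, tasks (a)–(c) differ only in what the model sees as input: the gold leaves, the leaves plus distractor sentences, or an entire corpus from which the leaves must first be retrieved.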