Multi-hop Question Answering (QA) is a challenging task, since it requires accurately aggregating information from multiple context paragraphs and thoroughly understanding the underlying reasoning chains. Recent work in multi-hop QA has shown that performance can be boosted by first decomposing questions into simpler, single-hop questions. In this paper, we explore an additional utility of multi-hop decomposition from the perspective of explainable NLP: using the decomposed questions to create explanations by probing a neural QA model with them. We hypothesize that, in doing so, users will be better able to predict when the underlying QA system will give the correct answer. Through human participant studies, we verify that exposing users to the decomposition probes and the model's answers to those probes increases their ability to predict system performance on a per-question basis. We show that decomposition is an effective form of probing QA systems as well as a promising approach to explanation generation. In-depth analyses show the need for improvements in decomposition systems.
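To make the probing idea concrete, the following is a minimal sketch (not the paper's implementation) of decomposition-based probing: each single-hop sub-question is posed to the same QA model, and the probe/answer pairs are surfaced to the user as an explanation alongside the multi-hop answer. The model name, the example context, and the hand-written decomposition with the `[ANS1]` placeholder are illustrative assumptions; the paper uses a learned decomposition system.

```python
# Sketch of decomposition probing: ask single-hop probes, then show the
# probe/answer pairs as an explanation for the multi-hop prediction.
from transformers import pipeline

# Assumed off-the-shelf extractive QA model; any span-prediction model works.
qa = pipeline("question-answering", model="distilbert-base-cased-distilled-squad")

context = (
    "The Eiffel Tower was designed by Gustave Eiffel. "
    "Gustave Eiffel was born in Dijon, France."
)
multi_hop_question = "Where was the designer of the Eiffel Tower born?"

# Hypothetical output of a decomposition model: two single-hop probes,
# where [ANS1] stands for the answer to the first probe.
probes = [
    "Who designed the Eiffel Tower?",
    "Where was [ANS1] born?",
]

answers = []
for probe in probes:
    # Substitute earlier probe answers into later probes (single-chain case).
    for i, prev in enumerate(answers, start=1):
        probe = probe.replace(f"[ANS{i}]", prev)
    result = qa(question=probe, context=context)
    answers.append(result["answer"])
    print(f"Probe: {probe}\n  Answer: {result['answer']} (score {result['score']:.2f})")

final = qa(question=multi_hop_question, context=context)
print(f"\nMulti-hop answer: {final['answer']} (score {final['score']:.2f})")
```

In the human studies described above, users would see the printed probe/answer pairs (and their confidence scores) and then judge whether to trust the multi-hop answer; a probe answered incoherently is a signal that the system's final answer may be wrong.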