Existing datasets that contain boolean questions, such as BoolQ and TyDi QA, provide the user with a YES/NO response to the question. However, a one-word response is not sufficient for an explainable system. We promote explainability by releasing a new set of annotations marking the evidence in the existing TyDi QA and BoolQ datasets. We show that our annotations can be used to train a model that extracts improved evidence spans compared to models that rely on existing resources. We confirm our findings with a user study which shows that our extracted evidence spans enhance the user experience. We also provide further insight into the challenges of answering boolean questions, such as passages containing conflicting YES and NO answers, and varying degrees of relevance of the predicted evidence.