Many Question-Answering (QA) datasets contain unanswerable questions, but their treatment in QA systems remains primitive. Our analysis of the Natural Questions dataset (Kwiatkowski et al., 2019) reveals that a substantial portion of unanswerable questions ($\sim$21%) can be explained by the presence of unverifiable presuppositions. We discuss the shortcomings of current models in handling such questions and describe how an improved system could handle them. Through a user preference study, we demonstrate that the oracle behavior of our proposed system, which provides responses based on presupposition failure, is preferred over the oracle behavior of existing QA systems. We then discuss how our proposed system could be implemented, presenting a novel framework that breaks down the problem into three steps: presupposition generation, presupposition verification, and explanation generation. We report our progress in tackling each subproblem and present a preliminary approach to integrating these steps into an existing QA system. We find that adding presuppositions and their verifiability to an existing model yields modest gains in downstream performance and unanswerability detection. The biggest bottleneck is the verification component, which needs to be substantially improved for the integrated system to approach ideal behavior; even transfer from the best current entailment models falls short.
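The three-step framework lends itself to a simple pipeline decomposition placed in front of a standard QA model. The following is a minimal sketch of how generation, verification, and explanation could be chained; all class names, function names, and the toy question are hypothetical placeholders introduced here for illustration, not the paper's implementation, and the verification step is a naive substring check standing in for an entailment model.

```python
# Illustrative sketch of the three-step framework: presupposition generation,
# presupposition verification, and explanation generation. Placeholder logic only.

from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Presupposition:
    text: str                  # a proposition the question takes for granted
    verified: Optional[bool]   # True / False / None (not yet checked)


def generate_presuppositions(question: str) -> List[Presupposition]:
    """Step 1: decompose the question into the presuppositions it carries.

    A real system would use linguistic triggers (wh-phrases, definite
    descriptions, factives, ...); here we return a single toy presupposition.
    """
    return [Presupposition(text=f"the entity referenced in '{question}' exists",
                           verified=None)]


def verify_presupposition(presupposition: Presupposition, evidence: str) -> bool:
    """Step 2: check a presupposition against retrieved evidence.

    Stand-in for an entailment/verification model: naive substring match.
    """
    return presupposition.text.lower() in evidence.lower()


def generate_explanation(failed: List[Presupposition]) -> str:
    """Step 3: explain unanswerability in terms of the failed presupposition(s)."""
    reasons = "; ".join(p.text for p in failed)
    return f"This question is unanswerable because it assumes: {reasons}."


def answer_or_explain(question: str, evidence: str) -> str:
    """Route to an explanation when a presupposition fails, else defer to QA."""
    presuppositions = generate_presuppositions(question)
    for p in presuppositions:
        p.verified = verify_presupposition(p, evidence)
    failed = [p for p in presuppositions if not p.verified]
    if failed:
        return generate_explanation(failed)
    return "<run the standard QA model to produce an answer>"


if __name__ == "__main__":
    # Hypothetical example: a question with a presupposition the evidence fails to support.
    print(answer_or_explain("Which linguist invented the lightbulb?",
                            "Thomas Edison, an inventor, developed the lightbulb."))
```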