In open-domain question answering, questions are highly likely to be ambiguous because users may not know the scope of relevant topics when formulating them. A system therefore needs to find every possible interpretation of a question and propose a set of disambiguated question-answer pairs. In this paper, we present a model that aggregates and combines evidence from multiple passages to generate question-answer pairs. In particular, our model reads a large number of passages to find as many interpretations as possible. In addition, we propose a novel round-trip prediction approach that generates additional interpretations the model fails to find in the first pass, then verifies the candidates and filters out incorrect question-answer pairs to arrive at the final disambiguated output. On the recently introduced AmbigQA open-domain question answering dataset, our model, named Refuel, achieves a new state of the art, outperforming the previous best model by a large margin. We also conduct comprehensive analyses to validate the effectiveness of the proposed round-trip prediction.
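The round-trip prediction loop described above can be sketched roughly as follows. This is only one reading of the abstract, not the Refuel implementation: the function name `round_trip_prediction`, the three callables (`predict_answers`, `rewrite_question`, `verify`), the `max_rounds` parameter, and the toy stand-ins are all hypothetical and exist purely for illustration. In a real system each callable would be a trained model.

```python
from typing import Callable, List, Sequence, Set, Tuple

QAPair = Tuple[str, str]  # (disambiguated question, answer)


def round_trip_prediction(
    question: str,
    passages: Sequence[str],
    predict_answers: Callable[[str, Sequence[str], List[str]], List[str]],
    rewrite_question: Callable[[str, str, Sequence[str]], str],
    verify: Callable[[str, str, Sequence[str]], bool],
    max_rounds: int = 3,
) -> List[QAPair]:
    """Hypothetical sketch: repeatedly ask for answers not found so far,
    rewrite the question for each new answer, and keep verified pairs."""
    seen: Set[str] = set()
    accepted: List[QAPair] = []
    for _ in range(max_rounds):
        # Feed the answers found so far back in, asking for further ones.
        candidates = predict_answers(question, passages, sorted(seen))
        fresh = []
        for answer in candidates:
            if answer not in seen:
                seen.add(answer)
                fresh.append(answer)
        if not fresh:
            break  # no additional interpretations were found
        for answer in fresh:
            disambiguated = rewrite_question(question, answer, passages)
            # Verification step: drop pairs the evidence does not support.
            if verify(disambiguated, answer, passages):
                accepted.append((disambiguated, answer))
    return accepted


# Toy stand-ins (assumptions for illustration only, not trained models).
def toy_predict(question, passages, known):
    # Pretend the reader surfaces at most one new answer per call.
    for candidate in ["1997", "2001", "2042"]:
        if candidate not in known:
            return [candidate]
    return []


def toy_rewrite(question, answer, passages):
    return f"{question} (interpretation answered by {answer})"


def toy_verify(disambiguated, answer, passages):
    # Accept a pair only if the answer string occurs in some passage.
    return any(answer in passage for passage in passages)


passages = [
    "The book was published in 1997.",
    "The film was released in 2001.",
]
pairs = round_trip_prediction(
    "When was it released?", passages,
    toy_predict, toy_rewrite, toy_verify, max_rounds=5,
)
```

In this toy run the first pass finds one answer, later rounds surface two more candidates, and the verification step discards the candidate that no passage supports. Separating generation from verification keeps the filter independent of how the extra candidates were produced.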