Retrieving information from related paragraphs or documents to answer open-domain multi-hop questions is very challenging. To address this challenge, most existing works treat paragraphs as nodes in a graph and propose graph-based methods to retrieve them. In this paper, however, we point out an intrinsic defect of such methods. Instead, we propose a new architecture that models paragraphs as sequential data and treats multi-hop information retrieval as a sequence labeling task. Specifically, we design a rewritable external memory to model the dependencies among paragraphs. Moreover, we propose a threshold gate mechanism to eliminate the distraction of noisy paragraphs. We evaluate our method on both the full wiki and distractor subtasks of HotpotQA, a public textual multi-hop QA dataset that requires multi-hop information retrieval. Experiments show that our method achieves significant improvements over the published state-of-the-art method in both retrieval and downstream QA performance.
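To make the sequence-labeling framing concrete, the toy sketch below reads paragraphs in order, scores each one against a rewritable memory, and labels it relevant only if the score passes a threshold gate. It is an illustration under loud assumptions, not the architecture evaluated in the paper: the function names, the word-overlap score (a stand-in for a learned scorer), the set-of-words memory, and the threshold value are all hypothetical.

```python
def tokenize(text):
    """Lowercase whitespace tokenizer; a stand-in for a learned encoder."""
    return set(text.lower().split())


def label_paragraphs(question, paragraphs, threshold=0.3):
    """Label each paragraph 1 (relevant) or 0 (noise), reading them in order.

    Hypothetical toy model:
    - the memory starts as the question's words,
    - a paragraph's score is the fraction of its words found in the memory,
    - the threshold gate accepts a paragraph only if score >= threshold,
    - an accepted paragraph rewrites the memory, so later paragraphs can
      match on evidence found in earlier hops.
    """
    memory = tokenize(question)
    labels = []
    for paragraph in paragraphs:
        tokens = tokenize(paragraph)
        score = len(tokens & memory) / len(tokens) if tokens else 0.0
        if score >= threshold:
            labels.append(1)
            memory = memory | tokens  # rewrite memory with the new evidence
        else:
            labels.append(0)  # gated out: noise leaves the memory untouched
    return labels


question = "where was the director of inception born"
hop_one = "inception was directed by christopher nolan"
noise = "bananas are rich in potassium"
hop_two = "christopher nolan grew up in london"

print(label_paragraphs(question, [hop_one, noise, hop_two]))
```

In this toy setting the second-hop paragraph shares no words with the question and is accepted only because the first hop has already written "christopher nolan" into the memory. Presenting the paragraphs in the reverse order causes it to be gated out, which is the kind of dependency among paragraphs the sequential view is meant to capture.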