We propose a new model, DocHopper, that iteratively attends to different parts of long, hierarchically structured documents to answer complex questions. As in multi-hop question-answering (QA) systems, at each step DocHopper uses a query $q$ to attend to information from a document, and then combines this ``retrieved'' information with $q$ to produce the next query. In contrast to most previous multi-hop QA systems, however, DocHopper can ``retrieve'' either short passages or long sections of the document, thus emulating a multi-step process of ``navigating'' through a long document to answer a question. To enable this behavior, DocHopper does not combine document information with $q$ by concatenating the retrieved text to the text of $q$. Instead, it combines a compact neural representation of $q$ with a compact neural representation of a hierarchical part of the document, which can itself be quite large. We evaluate DocHopper on four QA tasks that require reading long and complex documents to answer multi-hop questions, and show that it achieves state-of-the-art results on three of the datasets. DocHopper is also efficient at inference time, running 3--10 times faster than the baselines.
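The iterative update described above can be illustrated with a toy sketch. This is not the DocHopper implementation: the abstract does not specify the scoring function, the encoders, or how the query and the retrieved representation are mixed, so the dot-product scoring, the mean-pooled section vectors, the additive query update, and all names below (`build_candidates`, `hop`, `navigate`) are assumptions chosen only to make the idea concrete.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def softmax(scores: List[float]) -> List[float]:
    # Subtract the max for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mean_vector(vectors: List[Vector]) -> Vector:
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def build_candidates(document: Dict[str, List[Vector]]) -> List[Tuple[str, Vector]]:
    """Flatten a two-level document into retrieval candidates.

    Assumption: a section is represented by the mean of its sentence
    vectors, so a long section and a short sentence live in the same space.
    """
    candidates: List[Tuple[str, Vector]] = []
    for title, sentences in document.items():
        candidates.append((title, mean_vector(sentences)))
        for i, vec in enumerate(sentences):
            candidates.append((f"{title}/s{i}", vec))
    return candidates


def hop(query: Vector, candidates: List[Tuple[str, Vector]]) -> Tuple[str, Vector]:
    """One step: attend over all candidates, then update the query."""
    weights = softmax([dot(query, vec) for _, vec in candidates])
    best = max(range(len(candidates)), key=weights.__getitem__)
    name, vec = candidates[best]
    # Assumption: the next query is the sum of the current query and the
    # retrieved representation (no text concatenation is involved).
    next_query = [q + v for q, v in zip(query, vec)]
    return name, next_query


def navigate(query: Vector, document: Dict[str, List[Vector]], num_hops: int = 2):
    """Run several hops and return the visited parts and the final query."""
    candidates = build_candidates(document)
    path: List[str] = []
    for _ in range(num_hops):
        name, query = hop(query, candidates)
        path.append(name)
    return path, query
```

Because the query update operates on fixed-size vectors, the cost of a hop in this sketch does not depend on how much text the retrieved part contains, which is the property the abstract attributes to combining compact representations rather than concatenating text.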