Neural network models recently proposed for question answering (QA) primarily focus on capturing the passage-question relation. However, they have minimal capability to link relevant facts distributed across multiple sentences, which is crucial for achieving deeper understanding, such as performing multi-sentence reasoning and co-reference resolution. They also do not explicitly focus on the question and answer types, which often play a critical role in QA. In this paper, we propose a novel end-to-end question-focused multi-factor attention network for answer extraction. Multi-factor attentive encoding using tensor-based transformation aggregates meaningful facts even when they are located in multiple sentences. To implicitly infer the answer type, we also propose a max-attentional question aggregation mechanism that encodes a question vector based on the important words in a question. During prediction, we incorporate the sequence-level encoding of the first wh-word and its immediately following word as an additional source of question type information. Our proposed model achieves significant improvements over prior state-of-the-art results on three large-scale, challenging QA datasets, namely NewsQA, TriviaQA, and SearchQA.
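The abstract names two mechanisms: multi-factor attentive encoding via a tensor-based transformation, and max-attentional question aggregation. The following is a minimal PyTorch sketch of both ideas as we read them from the abstract alone; it is an illustration of the general technique, not the authors' implementation, and all class, function, and dimension names here are our own assumptions.

```python
# Hedged sketch (not the paper's released code) of the two attention
# mechanisms named in the abstract, assuming batch-first PyTorch tensors.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultiFactorAttention(nn.Module):
    """Tensor-based passage self-attention: each of m factors scores every
    pair of token positions with its own bilinear form, and the factors are
    max-pooled so facts spread across sentences can be aggregated."""

    def __init__(self, hidden_dim: int, num_factors: int):
        super().__init__()
        # One bilinear weight matrix per factor: shape (m, h, h).
        self.W = nn.Parameter(
            torch.randn(num_factors, hidden_dim, hidden_dim) * 0.01)

    def forward(self, passage: torch.Tensor) -> torch.Tensor:
        # passage: (batch, seq_len, hidden_dim)
        # Factor-specific pairwise scores: (batch, m, seq_len, seq_len).
        scores = torch.einsum('bih,mhk,bjk->bmij', passage, self.W, passage)
        # Max-pool over the factor axis, then normalize over positions.
        pooled, _ = scores.max(dim=1)              # (batch, seq, seq)
        attn = F.softmax(pooled, dim=-1)
        # Each token gathers supporting facts from all other positions.
        return torch.bmm(attn, passage)            # (batch, seq, hidden_dim)


def max_attentional_question_vector(attn_matrix: torch.Tensor,
                                    question_hidden: torch.Tensor) -> torch.Tensor:
    """One plausible reading of max-attentional question aggregation:
    weight each question word by its peak relevance to any passage token.

    attn_matrix: (batch, passage_len, question_len) raw similarity scores
    question_hidden: (batch, question_len, hidden_dim)
    """
    max_scores, _ = attn_matrix.max(dim=1)         # (batch, question_len)
    weights = F.softmax(max_scores, dim=-1)        # importance per word
    # Weighted sum of question word encodings -> single question vector.
    return torch.bmm(weights.unsqueeze(1), question_hidden).squeeze(1)
```

As a design note, max-pooling over factors (rather than averaging) lets each token pair be scored by whichever factor captures their relation best, which is consistent with the abstract's claim of aggregating facts located in different sentences.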