In recent years, deep learning methods for machine reading have advanced substantially. In machine reading, the reader must extract the answer to a question from a given ground-truth paragraph. State-of-the-art machine reading models have recently achieved human-level performance on SQuAD, a reading comprehension-style question answering (QA) task. This success has inspired researchers to combine information retrieval with machine reading to tackle open-domain QA. However, such systems still perform poorly compared to reading comprehension-style QA, because it is difficult to retrieve the passages that contain the answer to the question. In this study, we propose two neural network rankers that assign scores to passages according to how likely each is to contain the answer to a given question. Additionally, we analyze the relative importance of semantic similarity and word-level relevance matching in open-domain QA.
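To make the idea of scoring passages for a question concrete, here is a minimal sketch of word-level relevance matching only. It is not one of the neural rankers proposed in this study: the IDF-weighted term overlap, the function names (`tokenize`, `idf_table`, `overlap_score`, `rank_passages`), and the example passages are all assumptions chosen for illustration. A learned ranker would replace `overlap_score` with a model's output.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def idf_table(passages):
    """Smoothed inverse document frequency for each term in the passage pool."""
    n = len(passages)
    df = Counter()
    for p in passages:
        df.update(set(tokenize(p)))  # count each term once per passage
    return {t: math.log((n + 1) / (c + 1)) + 1.0 for t, c in df.items()}


def overlap_score(question, passage, idf):
    """Sum the IDF weights of question terms that also occur in the passage."""
    p_terms = set(tokenize(passage))
    return sum(idf.get(t, 0.0) for t in set(tokenize(question)) if t in p_terms)


def rank_passages(question, passages):
    """Return (score, passage) pairs sorted from most to least relevant."""
    idf = idf_table(passages)
    scored = [(overlap_score(question, p, idf), p) for p in passages]
    return sorted(scored, key=lambda sp: sp[0], reverse=True)


# Hypothetical passage pool, standing in for the output of a retrieval step.
passages = [
    "The Eiffel Tower is located in Paris and was completed in 1889.",
    "Paris is the capital of France.",
    "The Great Wall of China is visible across northern China.",
]
ranked = rank_passages("When was the Eiffel Tower completed?", passages)
for score, passage in ranked:
    print(f"{score:.2f}  {passage}")
```

In an open-domain QA pipeline, the top-ranked passages would then be handed to a machine reader for answer extraction. A purely lexical scorer like this one cannot match paraphrases, which is one reason to weigh semantic similarity against word-level matching.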