We study open-domain question answering over structured, unstructured, and semi-structured knowledge sources, including text, tables, lists, and knowledge bases. Departing from prior work, we propose a unifying approach that homogenizes all sources by reducing them to text and then applies the retriever-reader model, which has so far been limited to text sources. Our approach improves results on knowledge-base QA tasks by 11 points over the latest graph-based methods. More importantly, we demonstrate that our unified knowledge (UniK-QA) model is a simple yet effective way to combine heterogeneous sources of knowledge, advancing the state of the art on two popular question answering benchmarks, NaturalQuestions and WebQuestions, by 3.5 and 2.6 points, respectively. The code of UniK-QA is available at https://github.com/facebookresearch/UniK-QA.
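The idea of reducing heterogeneous sources to text can be illustrated with a minimal sketch. The code below is not taken from the UniK-QA repository: the function names, the sentence templates, and the sample data are hypothetical, and the actual linearization used by the authors may differ. It only shows how knowledge-base triples and table rows might be flattened into plain strings so that a single text retriever can index them alongside ordinary passages.

```python
from typing import List, Tuple

# A knowledge-base fact as (subject, relation, object). Hypothetical format.
Triple = Tuple[str, str, str]


def linearize_triple(triple: Triple) -> str:
    """Turn one KB triple into a short sentence-like string."""
    subject, relation, obj = triple
    # Assumed convention: relation names use underscores between words.
    return f"{subject} {relation.replace('_', ' ')} {obj}."


def linearize_table_row(headers: List[str], row: List[str]) -> str:
    """Turn one table row into text by pairing each header with its cell."""
    cells = [f"{h} is {v}" for h, v in zip(headers, row) if v]
    return ", ".join(cells) + "."


def build_unified_corpus(
    passages: List[str],
    triples: List[Triple],
    headers: List[str],
    rows: List[List[str]],
) -> List[str]:
    """Merge all sources into one list of text units for a text retriever."""
    corpus = list(passages)
    corpus.extend(linearize_triple(t) for t in triples)
    corpus.extend(linearize_table_row(headers, r) for r in rows)
    return corpus


if __name__ == "__main__":
    corpus = build_unified_corpus(
        passages=["Honolulu is the capital of Hawaii."],
        triples=[("Barack Obama", "place_of_birth", "Honolulu")],
        headers=["Country", "Capital"],
        rows=[["France", "Paris"]],
    )
    for unit in corpus:
        print(unit)
```

Once every source is a string, the retriever and reader need no source-specific logic, which is the property the abstract relies on.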