Question answering (QA) over knowledge graphs (KGs) and other RDF data has advanced considerably, and several systems now provide crisp answers to natural language questions or telegraphic queries. Some of these systems incorporate textual sources as additional evidence for the answering process, but cannot compute answers that are present in text alone. Conversely, systems from the IR and NLP communities have addressed QA over text, but barely utilize semantic data and knowledge. This paper presents the first QA system that can seamlessly operate over RDF datasets, text corpora, or both together, in a unified framework. Our method, called UNIQORN, builds a context graph on the fly by retrieving question-relevant triples from the RDF data and/or the text corpus, where the latter case is handled by automatic information extraction. The resulting graph is typically rich but highly noisy. UNIQORN copes with this input by applying advanced graph algorithms for Group Steiner Trees, which identify the best answer candidates in the context graph. Experimental results on several benchmarks of complex questions with multiple entities and relations show that UNIQORN, an unsupervised method with only five parameters, produces results comparable to the state of the art on KGs, text corpora, and heterogeneous sources. The graph-based methodology provides user-interpretable evidence for the complete answering process.
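To make the Group Steiner Tree idea concrete, here is a minimal sketch of how answer candidates might be ranked in a small context graph. It is not UNIQORN's algorithm: it swaps in a simple shortest-path heuristic that scores each non-terminal node by the summed distance to the nearest member of every terminal group, which is only an upper bound on the cost of a tree rooted at that node. The function names, the edge costs, and the toy graph are all assumptions made for illustration.

```python
import heapq
from collections import defaultdict


def build_graph(edges):
    """Build an undirected weighted graph from (node_a, node_b, cost) tuples."""
    graph = defaultdict(dict)
    for a, b, cost in edges:
        graph[a][b] = cost
        graph[b][a] = cost
    return graph


def shortest_distances(graph, source):
    """Dijkstra's algorithm: cheapest path cost from source to every reachable node."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for neighbor, cost in graph.get(node, {}).items():
            candidate = d + cost
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return dist


def rank_candidates(graph, groups):
    """Rank non-terminal nodes by summed distance to the closest node of each group.

    Heuristic stand-in for a Group Steiner Tree solver (an assumption, not the
    paper's method): a lower score means the node connects all groups more cheaply.
    """
    terminals = set().union(*groups)
    scored = []
    for node in list(graph):
        if node in terminals:
            continue  # question terms themselves are not answer candidates
        dist = shortest_distances(graph, node)
        score = sum(min(dist.get(t, float("inf")) for t in group) for group in groups)
        if score < float("inf"):
            scored.append((node, score))
    return sorted(scored, key=lambda pair: (pair[1], pair[0]))


# Toy context graph (hypothetical data; uniform edge costs are an assumption).
edges = [
    ("Forrest Gump", "Tom Hanks", 1.0),
    ("Forrest Gump", "Robert Zemeckis", 1.0),
    ("Forrest Gump", "Best Picture", 1.0),
    ("Cast Away", "Tom Hanks", 1.0),
    ("Cast Away", "Robert Zemeckis", 1.0),
    ("Titanic", "Best Picture", 1.0),
    ("Titanic", "James Cameron", 1.0),
]
graph = build_graph(edges)

# Each group holds the graph nodes that match one question term.
groups = [{"Tom Hanks"}, {"Best Picture"}]
ranked = rank_candidates(graph, groups)
print(ranked[0])
```

In this toy graph the top-ranked node is the one directly linked to both terminal groups. An exact Group Steiner Tree solver would instead minimize the total cost of the connecting tree, so shared path segments would be counted once rather than once per group.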