Neural link predictors are immensely useful for identifying missing edges in large-scale Knowledge Graphs. However, it is still not clear how to use these models for answering more complex queries that arise in a number of domains, such as queries using logical conjunctions ($\land$), disjunctions ($\lor$) and existential quantifiers ($\exists$), while accounting for missing edges. In this work, we propose a framework for efficiently answering complex queries on incomplete Knowledge Graphs. We translate each query into an end-to-end differentiable objective, where the truth value of each atom is computed by a pre-trained neural link predictor. We then analyse two solutions to the resulting optimisation problem: continuous, gradient-based optimisation and combinatorial search. In our experiments, the proposed approach produces more accurate results than state-of-the-art methods -- black-box neural models trained on millions of generated queries -- without the need for training on a large and diverse set of complex queries. Using orders of magnitude less training data, we obtain relative improvements ranging from 8% up to 40% in Hits@3 across different Knowledge Graphs containing factual information. Finally, we demonstrate that it is possible to explain the outcome of our model in terms of the intermediate solutions identified for each of the complex query atoms. All our source code and datasets are available online, at https://github.com/uclnlp/cqd.
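As a rough illustration of the optimisation view sketched above (not the paper's actual implementation), the snippet below scores candidate answers to a two-hop conjunctive query $?T \, . \, \exists V : p(a, V) \land q(V, T)$ using a link predictor, combining the two atom truth values with a product t-norm and enumerating the intermediate variable with a small beam. The DistMult-style scorer, the random stand-in embeddings, and the beam size are hypothetical placeholders standing in for a real pre-trained model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for pre-trained link-predictor embeddings.
num_entities, num_relations, dim = 100, 10, 32
E = rng.normal(size=(num_entities, dim))   # entity embeddings
R = rng.normal(size=(num_relations, dim))  # relation embeddings


def atom_scores(head: int, relation: int) -> np.ndarray:
    """DistMult-style score of (head, relation, t) for every candidate tail t,
    squashed to (0, 1) so it can act as a soft truth value of the atom."""
    raw = (E[head] * R[relation]) @ E.T
    return 1.0 / (1.0 + np.exp(-raw))


def answer_two_hop(anchor: int, p: int, q: int, beam: int = 5, top_k: int = 3):
    """Answer ?T . exists V : p(anchor, V) and q(V, T) by beam search over V,
    combining the two atom truth values with a product t-norm."""
    first_hop = atom_scores(anchor, p)
    candidates_v = np.argsort(-first_hop)[:beam]  # best intermediate entities

    best = {}
    for v in candidates_v:
        second_hop = atom_scores(int(v), q)
        combined = first_hop[v] * second_hop  # product t-norm for the conjunction
        for t in np.argsort(-combined)[:top_k]:
            if combined[t] > best.get(int(t), 0.0):
                best[int(t)] = float(combined[t])

    return sorted(best.items(), key=lambda kv: -kv[1])[:top_k]


print(answer_two_hop(anchor=0, p=1, q=2))
```

In this sketch, the intermediate entities kept by the beam are exactly the per-atom assignments that can be inspected afterwards, which is the sense in which answers to complex queries can be explained in terms of intermediate solutions.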