Learning low-dimensional embeddings of knowledge graphs is a powerful approach for predicting unobserved or missing edges between entities. However, an open challenge in this area is developing techniques that go beyond simple edge prediction and handle more complex logical queries, which might involve multiple unobserved edges, entities, and variables. For instance, given an incomplete biological knowledge graph, we might want to predict "what drugs are likely to target proteins involved with both diseases X and Y?", a query that requires reasoning about all possible proteins that might interact with diseases X and Y. Here we introduce a framework to efficiently make predictions about conjunctive logical queries (a flexible but tractable subset of first-order logic) on incomplete knowledge graphs. In our approach, we embed graph nodes in a low-dimensional space and represent logical operators as learned geometric operations (e.g., translation, rotation) in this embedding space. By performing logical operations within a low-dimensional embedding space, our approach achieves a time complexity that is linear in the number of query variables, compared to the exponential complexity required by a naive enumeration-based approach. We demonstrate the utility of this framework in two application studies on real-world datasets with millions of relations: predicting logical relationships in a network of drug-gene-disease interactions and in a graph-based representation of social interactions derived from a popular web forum.
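To make the idea of answering a conjunctive query with geometric operations concrete, the following is a minimal sketch in plain Python, not the framework's actual implementation. It makes several assumptions that do not come from the text above: edge traversal is modeled as a translation by a per-relation offset, conjunction is modeled as an element-wise mean, candidates are scored by cosine similarity, and all entity names, relation names, and embedding values are hand-picked toy data (in practice they would be learned).

```python
import math

def translate(vec, offset):
    # Assumed projection operator: follow an edge type by adding its offset.
    return [v + o for v, o in zip(vec, offset)]

def intersect(vecs):
    # Assumed intersection operator: element-wise mean of the branch embeddings.
    n = len(vecs)
    return [sum(column) / n for column in zip(*vecs)]

def cosine(a, b):
    # Cosine similarity between two vectors of equal length.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def embed_query(anchors, final_relation, entity_emb, relation_emb):
    # anchors: list of (entity, relation) pairs, one branch per known entity.
    # Each branch is projected once, the branches are merged once, and the
    # merged embedding is projected once more: one operation per query edge,
    # with no enumeration of candidate intermediate entities.
    branches = [translate(entity_emb[e], relation_emb[r]) for e, r in anchors]
    merged = intersect(branches)
    return translate(merged, relation_emb[final_relation])

def rank_candidates(query_vec, candidates, entity_emb):
    # Order candidate answers by similarity to the query embedding.
    return sorted(candidates,
                  key=lambda c: cosine(query_vec, entity_emb[c]),
                  reverse=True)

# Hypothetical toy embeddings (2-dimensional, chosen by hand for illustration).
entity_emb = {
    "disease_X": [1.0, 0.0],
    "disease_Y": [0.0, 1.0],
    "drug_A": [2.0, 1.0],
    "drug_B": [-1.0, 0.5],
}
relation_emb = {
    "associated_protein": [1.0, 1.0],  # disease -> protein (hypothetical)
    "targeted_by": [0.5, -0.5],        # protein -> drug (hypothetical)
}

# "What drugs target proteins involved with both diseases X and Y?"
query_vec = embed_query(
    anchors=[("disease_X", "associated_protein"),
             ("disease_Y", "associated_protein")],
    final_relation="targeted_by",
    entity_emb=entity_emb,
    relation_emb=relation_emb,
)
ranking = rank_candidates(query_vec, ["drug_A", "drug_B"], entity_emb)
print(ranking)
```

The intermediate protein variable never appears as an explicit set of entities in this sketch; it exists only as the merged vector, which is why the cost grows with the number of query edges rather than with the number of candidate proteins.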