Entity Alignment (EA) aims to match equivalent entities that refer to the same real-world objects and is a key step for Knowledge Graph (KG) fusion. Most neural EA models cannot be applied to large-scale real-life KGs due to their excessive consumption of GPU memory and time. One promising solution is to divide a large EA task into several subtasks such that each subtask only needs to match two small subgraphs of the original KGs. However, it is challenging to divide the EA task without losing effectiveness. Existing methods display low coverage of potential mappings, insufficient evidence in context graphs, and largely differing subtask sizes. In this work, we design the DivEA framework for large-scale EA with high-quality task division. To include in the EA subtasks a high proportion of the potential mappings originally present in the large EA task, we devise a counterpart discovery method that exploits the locality principle of the EA task and the power of trained EA models. Unique to our counterpart discovery method is the explicit modelling of the chance of a potential mapping. We also introduce an evidence passing mechanism to quantify the informativeness of context entities and find the most informative context graphs with flexible control of the subtask size. Extensive experiments show that DivEA achieves higher EA performance than alternative state-of-the-art solutions.
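Two of the division-quality concerns named above, coverage of potential mappings and differing subtask sizes, can be made concrete with a small sketch. This is an illustration only, not part of DivEA: the function names `mapping_coverage` and `size_imbalance`, the representation of a subtask as a pair of entity sets, and the toy entity identifiers are all assumptions introduced here.

```python
from typing import Iterable, List, Set, Tuple

# Hypothetical representation: a subtask is a pair
# (entities from the source KG, entities from the target KG).
Subtask = Tuple[Set[str], Set[str]]


def mapping_coverage(subtasks: List[Subtask],
                     gold_mappings: Iterable[Tuple[str, str]]) -> float:
    """Fraction of gold mappings whose two entities share at least one subtask."""
    gold = list(gold_mappings)
    if not gold:
        return 0.0
    covered = sum(
        1 for s, t in gold
        if any(s in src and t in tgt for src, tgt in subtasks)
    )
    return covered / len(gold)


def size_imbalance(subtasks: List[Subtask]) -> float:
    """Ratio of the largest to the smallest subtask size (1.0 means balanced)."""
    sizes = [len(src) + len(tgt) for src, tgt in subtasks]
    return max(sizes) / min(sizes)


# Toy division of a tiny EA task into two subtasks (made-up entity ids).
subtasks = [
    ({"a1", "a2"}, {"b1", "b2"}),
    ({"a3"}, {"b3", "b4"}),
]
gold = [("a1", "b1"), ("a2", "b3"), ("a3", "b4"), ("a4", "b2")]

coverage = mapping_coverage(subtasks, gold)
imbalance = size_imbalance(subtasks)
```

In this toy division, the mapping (a2, b3) is lost because its two entities land in different subtasks, and (a4, b2) is lost because a4 is assigned to no subtask at all; a mapping lost this way cannot be recovered by any EA model run on the subtasks.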