Entity alignment (EA) merges knowledge graphs (KGs) by identifying equivalent entities in different graphs, which can effectively enrich the knowledge representations of KGs. In practice, however, KGs often include dangling entities whose counterparts cannot be found in the other graph, which limits the performance of EA methods. To improve EA in the presence of dangling entities, we propose an unsupervised method called Semi-constraint Optimal Transport for Entity Alignment in Dangling cases (SoTead). Our main idea is to model entity alignment between two KGs as an optimal transport problem from one KG's entities to the other's. First, we set pseudo entity pairs between the KGs based on pretrained word embeddings. Then, we conduct contrastive metric learning to obtain the transport cost between each entity pair. Finally, we introduce a virtual entity for each KG to "align" the dangling entities from the other KG, which relaxes the optimization constraints and leads to a semi-constraint optimal transport. In the experiments, we first show the superiority of SoTead on a commonly used entity alignment dataset. In addition, to compare its dangling entity detection ability against other baselines, we construct a medical cross-lingual knowledge graph dataset, MedED, on which SoTead also reaches state-of-the-art performance.
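
The virtual-entity idea can be illustrated with a small sketch. This is not the SoTead implementation. It assumes a precomputed cost matrix in place of the learned transport cost, and it swaps in a plain entropic Sinkhorn solver, since the abstract does not name the solver. The function name `align_with_virtual_entities`, the parameters `dangling_cost` and `reg`, and the choice of marginals are all hypothetical. Each KG receives one extra virtual entity that can absorb every entity of the other KG, so a real entity is no longer forced to match a real counterpart.

```python
import numpy as np


def align_with_virtual_entities(cost, dangling_cost=0.5, reg=0.05, n_iters=500):
    """Toy alignment with one virtual entity per KG (illustrative only).

    cost: (n, m) array of transport costs between real entities.
    dangling_cost: assumed cost of sending a real entity to a virtual one.
    reg: entropic regularisation strength for the Sinkhorn iterations.
    """
    n, m = cost.shape

    # Augment the cost matrix with a virtual row and a virtual column.
    aug = np.full((n + 1, m + 1), dangling_cost, dtype=float)
    aug[:n, :m] = cost
    aug[n, m] = 0.0  # virtual-to-virtual transport is free

    # Real entities carry unit mass; each virtual entity carries enough
    # mass to absorb every entity of the other KG (an assumption).
    a = np.append(np.ones(n), m)
    b = np.append(np.ones(m), n)

    # Standard Sinkhorn scaling on the augmented problem.
    kernel = np.exp(-aug / reg)
    u = np.ones(n + 1)
    v = np.ones(m + 1)
    for _ in range(n_iters):
        u = a / (kernel @ v)
        v = b / (kernel.T @ u)
    plan = u[:, None] * kernel * v[None, :]

    # A source entity whose largest mass goes to the virtual column is
    # reported as dangling (None).
    best = plan[:n].argmax(axis=1)
    matches = [None if j == m else int(j) for j in best]
    return matches, plan


# Toy case: two source entities have a close counterpart, the third does not.
toy_cost = np.array([[0.1, 0.9],
                     [0.9, 0.1],
                     [0.8, 0.9]])
matches, plan = align_with_virtual_entities(toy_cost)
print(matches)
```

In this toy case the third source entity is far from both targets, so most of its mass should flow to the virtual entity and it is reported as dangling. In this sketch, `dangling_cost` acts as a threshold: pairs whose cost is well above it tend to be routed to the virtual entities instead of being matched.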