Entity alignment (EA) aims to discover equivalent entities across different knowledge graphs (KGs), a task that plays an important role in knowledge engineering. Recently, EA with dangling entities has been proposed as a more realistic setting, which assumes that not every entity has an equivalent counterpart in the other KG. In this paper, we focus on this setting. Some prior work has explored the problem by leveraging translation APIs, pre-trained word embeddings, and other off-the-shelf tools. However, these approaches over-rely on side information (e.g., entity names) and fail when that side information is absent. At the same time, they under-exploit the most fundamental information in a KG: its graph structure. To make better use of structural information, we propose a novel entity alignment framework called Weakly-Optimal Graph Contrastive Learning (WOGCL), which is refined along three dimensions: (i) Model. We propose a novel Gated Graph Attention Network to capture both local and global graph structure similarity. (ii) Training. We design two learning objectives, contrastive learning and optimal transport learning, to obtain distinguishable entity representations via the optimal transport plan. (iii) Inference. In the inference phase, we propose a PageRank-based method to calculate higher-order structural similarity. Extensive experiments on two dangling benchmarks demonstrate that WOGCL, using pure structural information only, outperforms the current state-of-the-art methods in both the traditional (relaxed) and dangling (consolidated) settings. The code will be made public soon.
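To make the notion of an optimal transport plan between two entity sets more concrete, the following is a minimal sketch of the standard Sinkhorn iteration applied to a small similarity matrix. It is not the WOGCL implementation, whose code has not been released; the function name `sinkhorn_plan`, the parameters `reg` and `n_iters`, the uniform marginals, and the toy similarity scores are all assumptions made for illustration.

```python
import math


def sinkhorn_plan(sim, reg=0.1, n_iters=200):
    """Approximate an entropy-regularised transport plan (illustrative sketch).

    sim[i][j] is a similarity score between source entity i and target
    entity j. Uniform marginals are an assumption of this sketch.
    """
    n, m = len(sim), len(sim[0])
    # Gibbs kernel: a higher similarity corresponds to a lower transport cost.
    kernel = [[math.exp(s / reg) for s in row] for row in sim]
    row_mass, col_mass = 1.0 / n, 1.0 / m
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Rescale rows, then columns, towards the target marginals.
        u = [row_mass / sum(kernel[i][j] * v[j] for j in range(m))
             for i in range(n)]
        v = [col_mass / sum(kernel[i][j] * u[i] for i in range(n))
             for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


# Toy similarity matrix (hypothetical values) for three source and
# three target entities.
sim = [
    [0.9, 0.1, 0.2],
    [0.2, 0.8, 0.1],
    [0.1, 0.3, 0.7],
]
plan = sinkhorn_plan(sim)
# For each source entity, pick the target that receives the most mass.
matches = [max(range(len(row)), key=row.__getitem__) for row in plan]
```

Each row of `plan` spreads the mass of one source entity over the target entities, so the largest entry in a row can be read as that entity's most plausible counterpart. In a dangling setting one would additionally need a rule for abstaining when no entry is convincing; that step is left out here because the abstract does not specify it.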