Entity alignment seeks to find entities in different knowledge graphs (KGs) that refer to the same real-world object. Recent advances in KG embedding have given rise to embedding-based entity alignment, which encodes entities in a continuous embedding space and measures entity similarity based on the learned embeddings. In this paper, we conduct a comprehensive experimental study of this emerging field. We survey 23 recent embedding-based entity alignment approaches and categorize them by their techniques and characteristics. We also propose a new KG sampling algorithm, with which we generate a set of dedicated benchmark datasets with varying heterogeneity and distributions for realistic evaluation. We develop an open-source library that includes 12 representative embedding-based entity alignment approaches, and we extensively evaluate these approaches to understand their strengths and limitations. Additionally, for several directions not yet explored by current approaches, we perform exploratory experiments and report our preliminary findings to inform future studies. The benchmark datasets, open-source library, and experimental results are all accessible online and will be duly maintained.
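To make the idea of measuring entity similarity over learned embeddings concrete, the following is a minimal sketch rather than any of the surveyed approaches. It assumes the embeddings of both KGs have already been learned and lie in a shared space; the choice of cosine similarity, the greedy nearest-neighbor matching, the function names, and the toy vectors are all illustrative assumptions.

```python
import numpy as np


def cosine_similarity_matrix(a, b):
    """Pairwise cosine similarity between the rows of a and the rows of b."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T


def align_entities(emb_kg1, emb_kg2):
    """For each KG1 entity, return the index of its most similar KG2 entity.

    Greedy nearest-neighbor matching is an assumption made for illustration;
    it does not enforce a one-to-one mapping.
    """
    sim = cosine_similarity_matrix(emb_kg1, emb_kg2)
    return sim.argmax(axis=1)


# Toy 2-dimensional embeddings (hypothetical values, not real data).
emb_kg1 = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
emb_kg2 = np.array([[0.1, 0.9], [0.9, 0.8], [0.9, 0.1]])

matches = align_entities(emb_kg1, emb_kg2)
print(matches)  # index of the predicted KG2 counterpart for each KG1 entity
```

In this sketch, each row of `emb_kg1` is matched to the row of `emb_kg2` that points in the most similar direction, which is the simplest way to turn an embedding space into alignment predictions.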