Entity Resolution (ER) is a constituent part of integrating different knowledge graphs: it identifies entities that refer to the same real-world object. A promising approach is to use graph embeddings for ER, so that the similarity of two entities is determined by the similarity of their graph neighborhoods. The similarity computation for such embeddings reduces to calculating the distance between them in the embedding space, which is comparatively simple. However, previous work has shown that graph embeddings alone are not sufficient to achieve high ER quality. We therefore propose a more comprehensive ER approach for knowledge graphs called EAGER (Embedding-Assisted Knowledge Graph Entity Resolution), which flexibly utilizes both the similarity of graph embeddings and the similarity of attribute values within a supervised machine learning approach. We evaluate our approach on 23 benchmark datasets with differently sized and structured knowledge graphs and use hypothesis tests to ensure the statistical significance of our results. Furthermore, we compare our approach with state-of-the-art ER solutions: it yields competitive results for table-oriented ER problems and shallow knowledge graphs, and much better results for deeper knowledge graphs.
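To make the idea of combining both signals concrete, the following is a minimal sketch of how a feature vector for one candidate entity pair might be built from an embedding similarity and per-attribute similarities. It is not the EAGER implementation: the function names, the dictionary layout of an entity, the choice of cosine similarity for embeddings, and the token-level Jaccard similarity for attribute values are all assumptions made for illustration. In a supervised setting, vectors like these, labeled as match or non-match, would be the input to a classifier.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def jaccard_similarity(text_a, text_b):
    """Token-level Jaccard similarity of two strings (0.0 if both are empty)."""
    tokens_a = set(text_a.lower().split())
    tokens_b = set(text_b.lower().split())
    union = tokens_a | tokens_b
    if not union:
        return 0.0
    return len(tokens_a & tokens_b) / len(union)


def pair_features(entity_a, entity_b, attribute_names):
    """Build one feature vector for a candidate pair.

    Hypothetical entity layout (an assumption of this sketch):
    {"embedding": [float, ...], "attrs": {attribute_name: string_value}}
    The first feature is the embedding similarity; the remaining features
    are one similarity per attribute, with missing values treated as "".
    """
    features = [cosine_similarity(entity_a["embedding"], entity_b["embedding"])]
    for name in attribute_names:
        value_a = entity_a["attrs"].get(name, "")
        value_b = entity_b["attrs"].get(name, "")
        features.append(jaccard_similarity(value_a, value_b))
    return features


# Two toy entities from different knowledge graphs (made-up data).
entity_a = {"embedding": [1.0, 0.0], "attrs": {"name": "Alan Turing", "birthplace": "London"}}
entity_b = {"embedding": [1.0, 0.0], "attrs": {"name": "Turing Alan"}}

features = pair_features(entity_a, entity_b, ["name", "birthplace"])
print(features)
```

Keeping the embedding similarity and each attribute similarity as separate features, rather than averaging them into one score, leaves it to the learned model to decide how much weight each signal deserves for a given dataset.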