This paper studies a new problem setting of entity alignment for knowledge graphs (KGs). Because different KGs contain different sets of entities, some entities in one KG may have no counterpart in the other; we refer to these as dangling entities. As a first attempt at this problem, we construct a new dataset and design a multi-task learning framework for both entity alignment and dangling entity detection. The framework can abstain from predicting an alignment for entities it detects as dangling. We propose three techniques for dangling entity detection, all based on the distribution of nearest-neighbor distances: nearest neighbor classification, marginal ranking, and background ranking. After dangling entities are detected and removed, the entity alignment model incorporated in our framework provides more robust alignment for the remaining entities. Comprehensive experiments and analyses demonstrate the effectiveness of our framework. We further find that the dangling entity detection module can, in turn, improve alignment learning and final performance. The contributed resource is publicly available to foster further research.
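To make the abstention idea concrete, the following is a minimal sketch of detection by nearest-neighbor distance, not the framework itself. It assumes entity embeddings are already available as plain vectors and that a single fixed distance threshold separates matchable from dangling entities. The three proposed techniques learn this decision rather than fixing it by hand. The names `nearest_neighbor`, `align_with_abstention`, and `threshold`, as well as the toy embeddings, are hypothetical and chosen only for illustration.

```python
import math
from typing import Dict, Optional, Sequence, Tuple

Vector = Sequence[float]


def nearest_neighbor(query: Vector, targets: Dict[str, Vector]) -> Tuple[str, float]:
    """Return the name of the closest target embedding and its Euclidean distance."""
    best_name, best_dist = "", math.inf
    for name, vec in targets.items():
        dist = math.dist(query, vec)
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name, best_dist


def align_with_abstention(
    source: Dict[str, Vector],
    targets: Dict[str, Vector],
    threshold: float,
) -> Dict[str, Optional[str]]:
    """Align each source entity to its nearest target, or abstain (None).

    Assumption for illustration: an entity is treated as dangling when its
    nearest-neighbor distance exceeds a hand-picked threshold.
    """
    alignment: Dict[str, Optional[str]] = {}
    for name, vec in source.items():
        candidate, dist = nearest_neighbor(vec, targets)
        alignment[name] = candidate if dist <= threshold else None
    return alignment


# Toy 2-D embeddings (hypothetical); "Atlantis_en" has no counterpart.
source_kg = {
    "Paris_en": (0.0, 0.0),
    "Berlin_en": (1.0, 1.0),
    "Atlantis_en": (5.0, 5.0),
}
target_kg = {
    "Paris_fr": (0.1, 0.0),
    "Berlin_fr": (1.0, 0.9),
}

result = align_with_abstention(source_kg, target_kg, threshold=0.5)
for entity, match in result.items():
    print(entity, "->", match if match is not None else "(dangling, no prediction)")
```

In this toy setting the two matchable entities lie close to their counterparts, while the entity without a counterpart is far from every target and is therefore left unaligned.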