Cross-lingual and cross-domain knowledge alignment without sufficient external resources is a fundamental task for fusing irregular data. Entity alignment (EA), the element-wise fusion process that aims to discover equivalent objects across different knowledge graphs (KGs), has attracted great interest from industry and academia in recent years. Most existing EA methods explore the correlation between entities and relations through neighbor nodes, structural information, and external resources. However, these methods rarely model the complex intrinsic interactions among triple elements or the role information of those elements, which may lead to inadequate representations of triples. In addition, external resources are often unavailable, especially in cross-lingual and cross-domain applications, which limits the scalability of these methods. To address these shortcomings, this paper proposes OTIEA, a novel universal EA framework based on ontology pairs and a role enhancement mechanism via triple-aware attention, which does not introduce external resources. Specifically, an ontology-enhanced triple encoder is designed that mines intrinsic correlations and ontology pair information rather than treating elements independently. In addition, EA-oriented representations are obtained in a triple-aware entity decoder by fusing role diversity. Finally, a bidirectional iterative alignment strategy is deployed to expand the set of seed entity pairs. Experimental results on three real-world datasets show that the framework achieves competitive performance compared with baselines.
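The abstract does not spell out how the bidirectional iterative alignment strategy selects new pairs. One common reading, assumed here purely for illustration, is a mutual nearest-neighbor rule: an unaligned pair is promoted to a seed only when each entity is the other's closest match in embedding space, and the step is repeated. The function names, the cosine similarity measure, and the toy embeddings below are all hypothetical and are not taken from OTIEA.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def mutual_nearest_pairs(src_emb, tgt_emb, seeds):
    """Return new (src, tgt) pairs that are each other's nearest neighbor.

    src_emb, tgt_emb: dict mapping entity id -> embedding (list of floats).
    seeds: set of already aligned (src, tgt) pairs; their entities are skipped.
    """
    used_src = {s for s, _ in seeds}
    used_tgt = {t for _, t in seeds}
    free_src = [s for s in src_emb if s not in used_src]
    free_tgt = [t for t in tgt_emb if t not in used_tgt]
    if not free_src or not free_tgt:
        return set()
    # Forward direction: most similar unaligned target for each source.
    fwd = {s: max(free_tgt, key=lambda t: cosine(src_emb[s], tgt_emb[t]))
           for s in free_src}
    # Backward direction: most similar unaligned source for each target.
    bwd = {t: max(free_src, key=lambda s: cosine(src_emb[s], tgt_emb[t]))
           for t in free_tgt}
    # Keep only pairs on which both directions agree.
    return {(s, t) for s, t in fwd.items() if bwd[t] == s}


def expand_seeds(src_emb, tgt_emb, seeds, max_rounds=3):
    """Grow the seed set until no new mutual pairs appear or rounds run out."""
    seeds = set(seeds)
    for _ in range(max_rounds):
        new_pairs = mutual_nearest_pairs(src_emb, tgt_emb, seeds)
        if not new_pairs:
            break
        seeds |= new_pairs
        # Assumption: a full system would retrain the embeddings here
        # using the enlarged seed set before the next round.
    return seeds


# Toy embeddings (hypothetical) for two small KGs.
src = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
tgt = {"x": [1.0, 0.1], "y": [0.1, 1.0], "z": [1.0, 0.9]}
aligned = expand_seeds(src, tgt, {("a", "x")})
```

Requiring agreement in both directions trades recall for precision: a pair that is a nearest neighbor in only one direction is left unaligned rather than risk adding a wrong seed that later rounds would build on.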