ClusterEA: Scalable Entity Alignment with Stochastic Training and Normalized Mini-batch Similarities

Entity alignment (EA) aims at finding equivalent entities in different knowledge graphs (KGs). Embedding-based approaches have dominated the EA task in recent years. These methods suffer from problems that stem from the geometric properties of embedding vectors, including hubness and isolation. To solve these geometric problems, many normalization approaches have been adopted for EA. However, the increasing scale of KGs makes it hard for EA models to apply these normalization processes, which limits their use in real-world applications. To tackle this challenge, we present ClusterEA, a general framework that is capable of scaling up EA models and enhancing their results by leveraging normalization methods on mini-batches with a high entity equivalent rate. ClusterEA contains three components for aligning entities between large-scale KGs: stochastic training, ClusterSampler, and SparseFusion. It first trains a large-scale Siamese GNN for EA in a stochastic fashion to produce entity embeddings. Based on the embeddings, a novel ClusterSampler strategy is proposed for sampling highly overlapped mini-batches. Finally, ClusterEA incorporates SparseFusion, which normalizes local and global similarity and then fuses all similarity matrices to obtain the final similarity matrix. Extensive experiments with real-life datasets on EA benchmarks offer insight into the proposed framework, and suggest that it is capable of outperforming the state-of-the-art scalable EA framework by up to 8 times in terms of Hits@1.
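The idea of normalizing similarities inside mini-batches and then fusing them into one sparse matrix can be illustrated with a minimal sketch. It is not the ClusterEA implementation: the abstract does not name a specific normalization, so CSLS (a common hubness correction) is assumed here, only local per-batch similarity is handled, and the names `csls_normalize`, `fuse_batches`, and `predict` are hypothetical. The `batches` argument stands in for whatever a sampler such as ClusterSampler would produce.

```python
import numpy as np

def csls_normalize(sim, k=2):
    """Assumed normalization (CSLS): penalize entities that are close to everything."""
    k_row = min(k, sim.shape[1])
    k_col = min(k, sim.shape[0])
    # Mean of the k largest similarities for each source entity (rows).
    r_src = np.sort(sim, axis=1)[:, -k_row:].mean(axis=1, keepdims=True)
    # Mean of the k largest similarities for each target entity (columns).
    r_tgt = np.sort(sim, axis=0)[-k_col:, :].mean(axis=0, keepdims=True)
    return 2.0 * sim - r_src - r_tgt

def fuse_batches(src_emb, tgt_emb, batches, k=2, top=1):
    """Normalize similarity per mini-batch and accumulate into a sparse dict.

    batches: list of (src_ids, tgt_ids) integer index arrays (hypothetical format).
    Returns {(src_id, tgt_id): score}, keeping `top` candidates per source entity.
    """
    # L2-normalize so that dot products are cosine similarities.
    src = src_emb / np.linalg.norm(src_emb, axis=1, keepdims=True)
    tgt = tgt_emb / np.linalg.norm(tgt_emb, axis=1, keepdims=True)
    fused = {}
    for src_ids, tgt_ids in batches:
        sim = csls_normalize(src[src_ids] @ tgt[tgt_ids].T, k)
        # Keep only the best `top` columns per row to stay sparse.
        keep = np.argsort(-sim, axis=1)[:, :top]
        for i, s in enumerate(src_ids):
            for j in keep[i]:
                key = (int(s), int(tgt_ids[j]))
                fused[key] = fused.get(key, 0.0) + float(sim[i, j])
    return fused

def predict(fused):
    """Pick the highest-scoring target for every source entity."""
    best = {}
    for (s, t), score in fused.items():
        if s not in best or score > best[s][1]:
            best[s] = (t, score)
    return {s: t for s, (t, _) in best.items()}
```

Calling `predict(fuse_batches(src_emb, tgt_emb, batches))` returns one predicted target index per source entity. Because each dense similarity block is only as large as one mini-batch, memory stays bounded by the batch size rather than by the full size of the two KGs, which is the property the abstract relies on.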
