Multi-modal entity alignment aims to identify equivalent entities between two different multi-modal knowledge graphs, which consist of structural triples and images associated with entities. Most previous works focus on how to utilize and encode information from different modalities, yet it is non-trivial to leverage multi-modal knowledge in entity alignment because of modality heterogeneity. In this paper, we propose MCLEA, a Multi-modal Contrastive Learning based Entity Alignment model, to obtain effective joint representations for multi-modal entity alignment. Different from previous works, MCLEA considers task-oriented modalities and models the inter-modal relationships for each entity representation. In particular, MCLEA first learns individual representations from multiple modalities, and then performs contrastive learning to jointly model intra-modal and inter-modal interactions. Extensive experimental results show that MCLEA outperforms state-of-the-art baselines on public datasets under both supervised and unsupervised settings.
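To make the contrastive objective concrete, below is a minimal PyTorch sketch of how intra-modal and inter-modal interactions could be modeled with an InfoNCE-style loss over a batch of aligned entity pairs. It is an illustration of the general technique, not the paper's implementation: the embedding names (`g1_struct`, `g2_image`, etc.), the temperature `tau`, and the equal weighting of the two loss terms are all assumptions for the example.

```python
import torch
import torch.nn.functional as F

def info_nce(anchor: torch.Tensor, positive: torch.Tensor, tau: float = 0.1) -> torch.Tensor:
    """InfoNCE loss: row i of `positive` is the positive for row i of `anchor`;
    all other rows in the batch act as in-batch negatives."""
    anchor = F.normalize(anchor, dim=-1)
    positive = F.normalize(positive, dim=-1)
    logits = anchor @ positive.t() / tau                      # (B, B) similarity matrix
    targets = torch.arange(anchor.size(0), device=anchor.device)
    return F.cross_entropy(logits, targets)

# Per-modality embeddings for a batch of aligned entity pairs:
# row i of every g1_*/g2_* tensor refers to the same real-world entity.
# Random tensors stand in for the outputs of modality-specific encoders.
B, d = 32, 128
g1_struct, g2_struct = torch.randn(B, d), torch.randn(B, d)  # structural encoder outputs
g1_image,  g2_image  = torch.randn(B, d), torch.randn(B, d)  # visual encoder outputs

# Intra-modal term: pull aligned entities together within each modality,
# across the two knowledge graphs.
intra = info_nce(g1_struct, g2_struct) + info_nce(g1_image, g2_image)

# Inter-modal term: encourage agreement between the structural and visual
# views of the same entity within each graph.
inter = info_nce(g1_struct, g1_image) + info_nce(g2_struct, g2_image)

loss = intra + inter  # assumed equal weighting for this sketch
```

In this setup the intra-modal term drives the alignment signal (matched entities across graphs), while the inter-modal term regularizes each entity's modality-specific representations toward a consistent joint representation.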