Ontology alignment, also known as ontology matching (OM), plays a critical role in knowledge integration. Owing to its success in many domains, machine learning has been applied to OM. However, existing methods, which often rely on ad-hoc feature engineering or non-contextual word embeddings, have not yet outperformed rule-based systems, especially in the unsupervised setting. In this paper, we propose a novel OM system named BERTMap, which supports both unsupervised and semi-supervised settings. It first predicts mappings with a classifier obtained by fine-tuning the contextual embedding model BERT on text semantics corpora extracted from the ontologies, and then refines the mappings through extension and repair, using the ontology structure and logic. Our evaluation on three alignment tasks over biomedical ontologies shows that BERTMap can often outperform the leading OM systems LogMap and AML.
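The predict-then-refine flow described in the abstract can be illustrated with a minimal sketch. This is not BERTMap's implementation: the scorer `token_overlap` is a hypothetical stand-in for the fine-tuned BERT classifier, "extension" is assumed here to mean scoring the neighbours of already-matched classes, and "repair" is replaced by a simple greedy one-to-one filter rather than the logic-based repair the abstract refers to. All function names, thresholds, and the toy ontologies are assumptions made for illustration.

```python
from itertools import product
from typing import Callable, Dict, Set, Tuple

Mapping = Tuple[str, str]              # (source class id, target class id)
Scorer = Callable[[str, str], float]   # label pair -> score in [0, 1]


def token_overlap(a: str, b: str) -> float:
    """Placeholder scorer (Jaccard token overlap); NOT a BERT classifier."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def predict(src: Dict[str, str], tgt: Dict[str, str],
            score: Scorer, threshold: float) -> Dict[Mapping, float]:
    """Score every cross-ontology class pair and keep those above threshold."""
    out: Dict[Mapping, float] = {}
    for (s, s_label), (t, t_label) in product(src.items(), tgt.items()):
        value = score(s_label, t_label)
        if value >= threshold:
            out[(s, t)] = value
    return out


def extend(mappings: Dict[Mapping, float],
           src: Dict[str, str], tgt: Dict[str, str],
           src_nbrs: Dict[str, Set[str]], tgt_nbrs: Dict[str, Set[str]],
           score: Scorer, threshold: float) -> Dict[Mapping, float]:
    """Assumed extension step: iteratively score neighbours of matched classes."""
    result = dict(mappings)
    frontier = list(mappings)
    while frontier:
        s, t = frontier.pop()
        for ns, nt in product(src_nbrs.get(s, ()), tgt_nbrs.get(t, ())):
            if (ns, nt) in result:
                continue
            value = score(src[ns], tgt[nt])
            if value >= threshold:
                result[(ns, nt)] = value
                frontier.append((ns, nt))  # newly added pairs are expanded too
    return result


def repair(mappings: Dict[Mapping, float]) -> Dict[Mapping, float]:
    """Simplistic stand-in for repair: greedy one-to-one filter by score."""
    kept: Dict[Mapping, float] = {}
    used_src: Set[str] = set()
    used_tgt: Set[str] = set()
    for (s, t), value in sorted(mappings.items(), key=lambda kv: -kv[1]):
        if s not in used_src and t not in used_tgt:
            kept[(s, t)] = value
            used_src.add(s)
            used_tgt.add(t)
    return kept


# Toy ontologies: class id -> label, plus parent links used as neighbours.
src = {"s:heart": "heart", "s:organ": "body organ", "s:valve": "heart valve"}
tgt = {"t:heart": "heart", "t:organ": "organ", "t:valve": "valve of heart"}
src_parents = {"s:heart": {"s:organ"}}
tgt_parents = {"t:heart": {"t:organ"}}

predicted = predict(src, tgt, token_overlap, threshold=0.6)
extended = extend(predicted, src, tgt, src_parents, tgt_parents,
                  token_overlap, threshold=0.5)
final = repair(extended)
```

In this toy run the organ pair falls below the prediction threshold but is recovered during extension, because its children are already matched and the extension step uses a lower threshold. Swapping `token_overlap` for a real classifier would only require another function with the same `Scorer` signature.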