In this paper, a new perspective is suggested for unsupervised Ontology Matching (OM), also known as Ontology Alignment (OA), by treating it as a translation task. Ontologies are represented as graphs, and translation is performed from a node in the source ontology graph to a path in the target ontology graph. The proposed framework, Truveta Mapper (TM), leverages a multi-task sequence-to-sequence transformer model to align multiple ontologies in a zero-shot, unified, and end-to-end manner. Multi-tasking enables the model to learn relationships between ontologies implicitly through transfer learning, without requiring any explicitly labeled cross-ontology data, and allows the framework to outperform existing solutions in both runtime latency and alignment quality. The model is pre-trained and fine-tuned only on publicly available text corpora and intra-ontology data. The proposed solution outperforms state-of-the-art approaches, including Edit-Similarity, LogMap, AML, BERTMap, and the new OM frameworks recently presented at the Ontology Alignment Evaluation Initiative (OAEI22); it offers log-linear complexity, in contrast to the quadratic complexity of existing end-to-end methods, and overall makes the OM task efficient and more straightforward, with little need for post-processing such as mapping extension or mapping repair.
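To make the node-to-path translation idea concrete, the following is a minimal, hypothetical sketch in Python. It assumes a generic T5-style sequence-to-sequence checkpoint from the Hugging Face transformers library; the stand-in "t5-small" is not the TM model, and the task prefix, the SNOMED-to-ICD-10 example, and the path serialization shown are illustrative assumptions only, not the paper's actual scheme.

```python
# Minimal sketch of OM-as-translation: a seq2seq model maps a serialized
# source-ontology node to a serialized path in the target ontology graph.
# "t5-small" is a placeholder checkpoint, NOT the Truveta Mapper model;
# the task prefix and path format below are illustrative assumptions.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

# A source node serialized as text (hypothetical SNOMED CT concept label),
# with a hypothetical task prefix selecting the target ontology.
source_node = "translate SNOMED to ICD10: Myocardial infarction (disorder)"

inputs = tokenizer(source_node, return_tensors="pt")
output_ids = model.generate(**inputs, num_beams=4, max_new_tokens=32)

# The decoded string would be a root-to-leaf path in the target ontology,
# e.g. "diseases of the circulatory system / ischemic heart diseases / I21".
predicted_path = tokenizer.decode(output_ids[0], skip_special_tokens=True)
print(predicted_path)
```

Intuitively, decoding one bounded-length path per source node is what yields the log-linear cost claimed above: each of the n source nodes triggers a single generation whose output length scales with the target hierarchy's depth (roughly O(log n) for a balanced taxonomy), rather than scoring all O(n^2) source-target pairs.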