Automating ontology construction and curation is an important but challenging task in knowledge engineering and artificial intelligence. Prediction with machine learning techniques such as contextual semantic embedding is a promising direction, but the relevant research is still preliminary, especially for expressive ontologies in the Web Ontology Language (OWL). In this paper, we present a new subsumption prediction method named BERTSubs for classes of OWL ontologies. It exploits the pre-trained language model BERT to compute contextual embeddings of a class, where customized templates are proposed to incorporate the class context (e.g., neighbouring classes) and logical existential restrictions. BERTSubs is quite general, being able to predict multiple kinds of subsumers, including named classes and existential restrictions, from the same ontology or another ontology. Extensive evaluation on five real-world ontologies for three different subsumption tasks has shown the effectiveness of the templates, and that BERTSubs can dramatically outperform baselines that use (literal-aware) knowledge graph embeddings, non-contextual word embeddings, and state-of-the-art OWL ontology embeddings.
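To make the template idea concrete, the sketch below shows one plausible way such templates could turn class labels, ancestor-path context, and an OWL existential restriction into text inputs for a BERT sentence-pair classifier. The function names, the `<SEP>` delimiter, and the exact verbalizations are illustrative assumptions, not the paper's actual implementation.

```python
# Hypothetical sketch of template-based input construction for a BERT
# subsumption classifier; names and verbalizations are illustrative only.

def isolated_class_template(label_a, label_b):
    # Simplest template: a sentence pair of the two class labels alone,
    # to be fed to a BERT model fine-tuned for binary classification.
    return (label_a, label_b)

def path_context_template(label_a, path_a, label_b, path_b, sep=" <SEP> "):
    # Context-aware template: prepend each class's ancestor labels
    # (a path of neighbouring classes) before the class label itself.
    ctx_a = sep.join(list(path_a) + [label_a])
    ctx_b = sep.join(list(path_b) + [label_b])
    return (ctx_a, ctx_b)

def existential_restriction_text(property_label, filler_label):
    # Verbalize an OWL existential restriction (ObjectSomeValuesFrom),
    # e.g. "has location some lung", as natural-language-like text.
    return f"something that {property_label} some {filler_label}"
```

A candidate pair such as (`"lung cancer"`, `existential_restriction_text("has location", "lung")`) would then be scored by the fine-tuned BERT model as subsumption or not; the design choice is that all ontology structure is flattened into text so a pre-trained language model can exploit its lexical semantics.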