When semantically describing knowledge graphs (KGs), users must make a critical choice of vocabulary (i.e., predicates and resources). The success of KG construction depends on convergence toward shared vocabularies, so that meaning can be established. The typical lifecycle of a new KG can be described as follows: the nascent phases of graph construction experience terminology divergence, while later phases experience terminology convergence and reuse. In this paper, we describe our approach of tailoring two AI-based clustering algorithms to recommend predicates (in RDF statements) about resources in the Open Research Knowledge Graph (ORKG, https://orkg.org/). Such a service, which recommends existing predicates for semantifying newly incoming data from scholarly publications, is of paramount importance for fostering terminology convergence in the ORKG. Our experiments show very promising results: high precision with relatively high recall, at linear runtime. Furthermore, this work offers novel insights into the predicate groups that automatically and loosely accrue as generic patterns for the semantification of scholarly knowledge spanning 44 research fields.