Scholarly Knowledge Graphs (KGs) provide a rich source of structured information representing knowledge encoded in scientific publications. Because the published scientific literature is vast and describes scientific concepts through a plethora of heterogeneous entities and relations, these KGs are inherently incomplete. We present exBERT, a method that leverages pre-trained transformer language models to perform scholarly knowledge graph completion. We model the triples of a knowledge graph as text and perform triple classification (i.e., deciding whether a triple belongs to the KG or not). The evaluation shows that exBERT outperforms the baselines on three scholarly KG completion datasets on the tasks of triple classification, link prediction, and relation prediction. Furthermore, we present two scholarly datasets as resources for the research community, collected from public KGs and online resources.
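To make the idea of modelling triples as text concrete, the following is a minimal sketch of how labelled inputs for a binary triple classifier might be prepared. It is not the exBERT implementation: the `[CLS]`/`[SEP]` layout, the tail-corruption strategy for negative examples, the function names (`triple_to_text`, `corrupt_tail`, `build_examples`), and the toy triples are all assumptions made for illustration.

```python
import random
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]  # (head, relation, tail)


def triple_to_text(triple: Triple) -> str:
    """Serialize a triple into one BERT-style input string (assumed layout)."""
    head, relation, tail = triple
    return f"[CLS] {head} [SEP] {relation} [SEP] {tail} [SEP]"


def corrupt_tail(triple: Triple, entities: List[str],
                 known: Set[Triple], rng: random.Random) -> Triple:
    """Create a negative example by replacing the tail with a random entity,
    skipping replacements that would yield a triple already in the KG."""
    head, relation, _ = triple
    candidates = [e for e in entities if (head, relation, e) not in known]
    if not candidates:
        raise ValueError("no valid corruption available for this triple")
    return (head, relation, rng.choice(candidates))


def build_examples(triples: List[Triple], seed: int = 0) -> List[Tuple[str, int]]:
    """Return (text, label) pairs: label 1 for KG triples, 0 for corrupted ones."""
    rng = random.Random(seed)
    known = set(triples)
    entities = sorted({e for h, _, t in triples for e in (h, t)})
    examples = []
    for triple in triples:
        examples.append((triple_to_text(triple), 1))
        negative = corrupt_tail(triple, entities, known, rng)
        examples.append((triple_to_text(negative), 0))
    return examples


# Hypothetical toy triples, used only to show the data flow.
toy_triples = [
    ("exBERT", "addresses", "knowledge graph completion"),
    ("exBERT", "uses", "pre-trained transformer"),
]

for text, label in build_examples(toy_triples):
    print(label, text)
```

The resulting `(text, label)` pairs are the kind of input a sequence classifier built on a pre-trained transformer could be fine-tuned on. Under the same assumptions, link prediction and relation prediction can be approached by scoring candidate triples with such a classifier and ranking them.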