The ability of knowledge graphs to represent complex relationships at scale has led to their adoption for a variety of needs, including knowledge representation, question answering, fraud detection, and recommendation systems. Knowledge graphs are often incomplete in the information they represent, motivating knowledge graph completion tasks such as link and relation prediction. Pre-trained and fine-tuned language models have shown promise in these tasks, although these models ignore the intrinsic information encoded in the knowledge graph, namely the entity and relation types. In this work, we propose the Knowledge Graph Language Model (KGLM) architecture, which introduces a new entity/relation embedding layer that learns to differentiate distinct entity and relation types, thereby allowing the model to learn the structure of the knowledge graph. We show that further pre-training the language model with this additional embedding layer on triples extracted from the knowledge graph, followed by the standard fine-tuning phase, sets a new state of the art for the link prediction task on the benchmark datasets.
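To make the idea concrete, the following is a minimal PyTorch sketch of how such an entity/relation-type embedding layer could augment a language model's token embeddings. This is not the authors' implementation; the class name, the three-type scheme (head entity, relation, tail entity), and all dimensions are illustrative assumptions.

```python
import torch
import torch.nn as nn

class KGLMEmbedding(nn.Module):
    """Hypothetical sketch: token embeddings augmented with an
    entity/relation-type embedding, analogous to BERT's token-type
    (segment) embeddings."""

    def __init__(self, vocab_size, num_entity_relation_types, hidden_size):
        super().__init__()
        self.token_embeddings = nn.Embedding(vocab_size, hidden_size)
        # New layer: one embedding per entity/relation type, so the model
        # can distinguish, e.g., head entities, relations, and tail entities.
        self.type_embeddings = nn.Embedding(num_entity_relation_types, hidden_size)
        self.layer_norm = nn.LayerNorm(hidden_size)
        self.dropout = nn.Dropout(0.1)

    def forward(self, input_ids, type_ids):
        # input_ids: token ids of a linearized triple (head, relation, tail)
        # type_ids:  per-token entity/relation-type ids, same shape as input_ids
        embeddings = self.token_embeddings(input_ids) + self.type_embeddings(type_ids)
        return self.dropout(self.layer_norm(embeddings))

# Example: a linearized triple with per-token type ids
# (0 = head entity, 1 = relation, 2 = tail entity -- illustrative only).
emb = KGLMEmbedding(vocab_size=30522, num_entity_relation_types=3, hidden_size=768)
input_ids = torch.tensor([[101, 2054, 2003, 102]])  # placeholder token ids
type_ids = torch.tensor([[0, 1, 1, 2]])             # placeholder type ids
out = emb(input_ids, type_ids)                      # shape: (1, 4, 768)
```

Adding the type embedding rather than concatenating it keeps the hidden size unchanged, so pre-trained transformer weights can be reused during further pre-training, mirroring how BERT incorporates segment embeddings.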