Learning transferable representations of knowledge graphs (KGs) is challenging due to the heterogeneous, multi-relational nature of graph structures. Inspired by the success of Transformer-based pretrained language models in learning transferable representations for text, we introduce a novel inductive KG representation model (iHT) for KG completion via large-scale pre-training. iHT consists of an entity encoder (e.g., BERT) and a neighbor-aware relational scoring function, both parameterized by Transformers. We first pre-train iHT on a large KG dataset, Wikidata5M. Our approach achieves new state-of-the-art results on matched evaluations, with a relative improvement of more than 25% in mean reciprocal rank over previous SOTA models. When further fine-tuned on smaller KGs with entity and relational shifts, the pre-trained iHT representations prove transferable, significantly improving performance on FB15K-237 and WN18RR.
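To make the two-component layout concrete, below is a minimal PyTorch sketch of an entity encoder plus a neighbor-aware relational scorer, both parameterized by Transformers, as the abstract describes. The class names, dimensions, and the randomly initialized encoder are illustrative assumptions only; the paper's entity encoder would be a pretrained model such as BERT, and its scoring function may differ in detail.

```python
# Minimal sketch of the two-Transformer layout described in the abstract.
# All dimensions and names are hypothetical; the paper's entity encoder is a
# pretrained BERT, not the randomly initialized Transformer used here.
import torch
import torch.nn as nn


class EntityEncoder(nn.Module):
    """Encodes an entity's token sequence (e.g., its textual description) into one vector."""

    def __init__(self, vocab_size=30522, dim=256, layers=2, heads=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        block = nn.TransformerEncoderLayer(dim, heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(block, layers)

    def forward(self, token_ids):                       # (batch, seq_len)
        h = self.encoder(self.embed(token_ids))         # (batch, seq_len, dim)
        return h[:, 0]                                  # first-token vector as the entity representation


class RelationalScorer(nn.Module):
    """Neighbor-aware scorer: attends over [head, relation, neighbors] and scores candidate tails."""

    def __init__(self, num_relations, num_entities, dim=256, layers=2, heads=4):
        super().__init__()
        self.rel_embed = nn.Embedding(num_relations, dim)
        block = nn.TransformerEncoderLayer(dim, heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(block, layers)
        self.out = nn.Linear(dim, num_entities)         # scores over all candidate tail entities

    def forward(self, head_vec, rel_ids, neighbor_vecs):
        # Input sequence: head entity vector, relation embedding, then neighbor entity vectors.
        rel_vec = self.rel_embed(rel_ids).unsqueeze(1)              # (batch, 1, dim)
        seq = torch.cat([head_vec.unsqueeze(1), rel_vec, neighbor_vecs], dim=1)
        h = self.encoder(seq)
        return self.out(h[:, 0])                                    # (batch, num_entities)


if __name__ == "__main__":
    enc = EntityEncoder()
    scorer = RelationalScorer(num_relations=100, num_entities=5000)
    tokens = torch.randint(0, 30522, (2, 16))            # toy entity descriptions
    head = enc(tokens)
    neighbors = enc(torch.randint(0, 30522, (2, 16))).unsqueeze(1)  # one neighbor per head
    scores = scorer(head, torch.tensor([3, 7]), neighbors)
    print(scores.shape)                                  # torch.Size([2, 5000])
```

Because the entity representation is produced from text rather than a fixed embedding table, a model of this shape can score entities unseen during training, which is what makes the representation inductive and transferable to KGs such as FB15K-237 and WN18RR after fine-tuning.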