Link prediction based on knowledge graph embeddings (KGE) aims to predict new triples so as to automatically construct knowledge graphs (KGs). However, recent KGE models achieve performance gains by excessively increasing the embedding dimension, which incurs enormous training costs and storage requirements. In this paper, instead of training high-dimensional models, we propose MulDE, a novel knowledge distillation framework that comprises multiple low-dimensional hyperbolic KGE models as teachers and two student components, Junior and Senior. Under a novel iterative distillation strategy, the Junior component, a low-dimensional KGE model, actively queries the teachers based on its preliminary prediction results, and the Senior component adaptively integrates the teachers' knowledge to train the Junior component via two mechanisms: relation-specific scaling and contrast attention. Experimental results show that MulDE effectively improves both the performance and the training speed of low-dimensional KGE models. The distilled 32-dimensional model is competitive with state-of-the-art high-dimensional methods on several widely used datasets.
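To make the multi-teacher setup concrete, the following is a minimal sketch of one distillation step, assuming a top-K candidate list scored by the Junior model and by each teacher. The function name `distillation_step`, the variance-based form of the contrast attention, and all tensor shapes are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn.functional as F

# Hypothetical sketch of one MulDE-style distillation step.
def distillation_step(junior_scores, teacher_scores, temperature=1.0, rel_scale=None):
    """junior_scores: (B, K) Junior scores on its top-K candidate entities.
    teacher_scores: (T, B, K) scores of each of the T teachers on the same candidates.
    rel_scale: optional (T, B) relation-specific scaling weights.
    Returns a soft-label distillation loss for the Junior component.
    """
    if rel_scale is not None:
        # Relation-specific scaling: rescale each teacher per (relation, query) pair.
        teacher_scores = teacher_scores * rel_scale.unsqueeze(-1)

    # Contrast attention (assumed form): weight each teacher by how sharply it
    # separates the candidates, measured here by the variance of its soft distribution.
    teacher_probs = F.softmax(teacher_scores / temperature, dim=-1)   # (T, B, K)
    contrast = teacher_probs.var(dim=-1)                              # (T, B)
    attn = F.softmax(contrast, dim=0).unsqueeze(-1)                   # (T, B, 1)

    # Senior component: adaptively integrate the teachers into one soft target.
    soft_targets = (attn * teacher_probs).sum(dim=0)                  # (B, K)

    junior_log_probs = F.log_softmax(junior_scores / temperature, dim=-1)
    return F.kl_div(junior_log_probs, soft_targets, reduction="batchmean")


if __name__ == "__main__":
    B, K, T = 4, 8, 3  # batch size, top-K candidates, number of teachers
    loss = distillation_step(torch.randn(B, K),
                             torch.randn(T, B, K),
                             rel_scale=torch.rand(T, B))
    print(loss.item())
```

In the actual framework this soft-label loss would be combined with the Junior model's ordinary hard-label link-prediction loss and repeated over the iterative query-then-distill cycle described above.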