Knowledge Graph Completion (KGC) aims to reason over known facts and infer missing links, but it performs poorly on sparse Knowledge Graphs (KGs). Recent works either introduce text information as auxiliary features or apply graph densification to alleviate this problem, but they suffer, respectively, from ineffective incorporation of structure features and from the injection of noisy triples. In this paper, we address sparse KGC from both directions simultaneously while handling their respective drawbacks, and propose VEM$^2$L, a plug-and-play unified framework for sparse KGs. The basic idea of VEM$^2$L is to have a text-based KGC model and a structure-based KGC model learn from each other so that their respective knowledge is fused into a whole. To exploit text and structure features jointly and in depth, we partition the knowledge within a model into two non-overlapping parts: expressive ability on the training set and generalization ability on unobserved queries. For the former, we have the text-based and structure-based models learn from each other on the training set. For the latter, we propose a novel knowledge fusion strategy derived from the Variational EM (VEM) algorithm, during which we also apply a graph densification operation, itself derived from VEM, to further alleviate the sparsity problem. Owing to the convergence of the EM algorithm, we theoretically guarantee that the likelihood function increases, with less impact from noisy injected triples. Combining these two fusion methods with graph densification yields the VEM$^2$L framework. Both detailed theoretical evidence and qualitative experiments demonstrate the effectiveness of the proposed framework.
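The idea of two models learning from each other on the training set can be illustrated with a generic mutual-learning loss. This is only a minimal sketch under stated assumptions, not the loss used in VEM$^2$L: each model scores the same candidate entities for one training query, and each is trained with cross-entropy plus a KL term that pulls it toward the other model's distribution. The function names, the weight `alpha`, and the example scores are all hypothetical.

```python
import math


def softmax(scores):
    """Turn raw candidate scores into a probability distribution."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same candidates."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def mutual_learning_losses(text_scores, struct_scores, gold_index, alpha=0.5):
    """Return (loss_text, loss_struct) for one training query.

    Assumption: each loss is cross-entropy on the gold entity plus
    alpha * KL(peer || self), with the peer's distribution treated as a
    fixed target. This is a generic mutual-learning form chosen for
    illustration only.
    """
    p_text = softmax(text_scores)
    p_struct = softmax(struct_scores)
    ce_text = -math.log(p_text[gold_index])
    ce_struct = -math.log(p_struct[gold_index])
    loss_text = ce_text + alpha * kl_divergence(p_struct, p_text)
    loss_struct = ce_struct + alpha * kl_divergence(p_text, p_struct)
    return loss_text, loss_struct


if __name__ == "__main__":
    # Hypothetical scores over three candidate tail entities; index 0 is gold.
    lt, ls = mutual_learning_losses([3.0, 0.0, 0.0], [0.0, 3.0, 0.0], gold_index=0)
    print(f"text-model loss: {lt:.4f}, structure-model loss: {ls:.4f}")
```

When the two models already agree, the KL terms vanish and each loss reduces to plain cross-entropy. When they disagree, each model is penalized in proportion to its divergence from its peer.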