Most real-world knowledge graphs (KGs) are far from complete and comprehensive. This problem has motivated efforts to predict the most plausible missing facts for a given KG, i.e., knowledge graph completion (KGC). However, existing KGC methods suffer from two main issues: 1) the false negative issue, i.e., the sampled negative training instances may include potential true facts; and 2) the data sparsity issue, i.e., true facts account for only a tiny fraction of all possible facts. To this end, we propose positive-unlabeled learning with adversarial data augmentation (PUDA) for KGC. In particular, PUDA tailors a positive-unlabeled risk estimator to the KGC task to deal with the false negative issue. Furthermore, to address the data sparsity issue, PUDA implements a data augmentation strategy by unifying adversarial training and positive-unlabeled learning under a positive-unlabeled minimax game. Extensive experimental results on real-world benchmark datasets demonstrate the effectiveness and compatibility of our proposed method.
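For context, a minimal sketch of the kind of estimator such an approach builds on is the standard non-negative positive-unlabeled risk estimator from the PU learning literature (Kiryo et al., 2017); the exact form PUDA tailors to KGC may differ, so the following is an assumed general formulation rather than the paper's definition:

$$
\hat{R}_{\mathrm{pu}}(f) \;=\; \pi_{\mathrm{p}}\,\hat{R}_{\mathrm{p}}^{+}(f) \;+\; \max\Bigl\{0,\; \hat{R}_{\mathrm{u}}^{-}(f) - \pi_{\mathrm{p}}\,\hat{R}_{\mathrm{p}}^{-}(f)\Bigr\},
$$

where $\pi_{\mathrm{p}}$ is the class prior (the proportion of true facts among candidate triples), $\hat{R}_{\mathrm{p}}^{+}(f)$ and $\hat{R}_{\mathrm{p}}^{-}(f)$ are the empirical risks of the scoring model $f$ on positive triples labeled as positive and negative respectively, and $\hat{R}_{\mathrm{u}}^{-}(f)$ is the empirical risk on unlabeled (sampled) triples labeled as negative. Treating sampled "negatives" as unlabeled rather than truly negative is what allows the estimator to account for false negatives.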