Embedding methods have demonstrated robust performance on the task of link prediction in knowledge graphs, mostly by encoding entity relationships. Recent methods propose to enhance the loss function with a literal-aware term. In this paper, we propose KGA: a knowledge graph augmentation method that incorporates literals in an embedding model without modifying its loss function. KGA discretizes quantity and year values into bins, and chains these bins both horizontally, modeling neighboring values, and vertically, modeling multiple levels of granularity. KGA is scalable and can be used as a pre-processing step for any existing knowledge graph embedding model. Experiments on legacy benchmarks and a new large benchmark, DWD, show that augmenting the knowledge graph with quantities and years is beneficial for predicting both entities and numbers, as KGA outperforms the vanilla models and other relevant baselines. Our ablation studies confirm that both quantities and years contribute to KGA's performance, and that its performance depends on the discretization and binning settings. We make the code, models, and the DWD benchmark publicly available to facilitate reproducibility and future research.
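The following is a minimal sketch of the augmentation idea as described above: numeric literals are discretized into bin entities, which are then chained horizontally (to neighboring bins) and vertically (to coarser-granularity bins), so that a vanilla embedding model can consume the result as ordinary triples. All identifiers here (bin_id, the "next" and "in_coarser" relations, the bin widths) are hypothetical illustrations, not the paper's actual naming or binning scheme.

```python
import math

def bin_id(value: float, width: float, level: int) -> str:
    """Map a numeric value to a bin entity name at a given granularity level."""
    index = math.floor(value / width)
    return f"bin_L{level}_{index}"

def augment(triples, value_triples, widths=(1.0, 10.0)):
    """Replace (head, relation, numeric_value) triples with bin entities and
    add horizontal/vertical chaining triples; return the augmented graph."""
    augmented = list(triples)
    for head, rel, value in value_triples:
        finer_bin = None
        for level, width in enumerate(widths):  # widths grow: finer -> coarser
            b = bin_id(value, width, level)
            if level == 0:
                # attach the entity to its finest-granularity bin
                augmented.append((head, rel, b))
            # horizontal chaining: link each bin to its right neighbor
            index = math.floor(value / width)
            augmented.append((b, "next", f"bin_L{level}_{index + 1}"))
            # vertical chaining: link the finer bin to the coarser one
            if finer_bin is not None:
                augmented.append((finer_bin, "in_coarser", b))
            finer_bin = b
    return augmented

# Usage: augment a toy graph with one quantity literal.
triples = [("Q_Paris", "capital_of", "Q_France")]
value_triples = [("Q_Paris", "population", 2148.0)]  # e.g. in thousands
for t in augment(triples, value_triples):
    print(t)
```

Because the output is a plain set of triples over an enlarged entity vocabulary, any off-the-shelf embedding model can be trained on it unchanged, which is what makes the approach loss-function-agnostic.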