Out-of-Vocabulary Entities in Link Prediction

Knowledge graph embedding techniques are key to making knowledge graphs amenable to the many machine learning approaches that operate on vector representations. Link prediction is often used as a proxy for evaluating the quality of these embeddings. Because creating link prediction benchmarks is time-consuming, most work on the subject relies on only a few of them. Since benchmarks are crucial for the fair comparison of algorithms, ensuring their quality amounts to providing solid ground for developing better approaches to link prediction and, by extension, to embedding knowledge graphs. Earlier studies of benchmarks pointed to limitations caused by information leaking from the development fragments to the test fragments of some benchmark datasets. We identify a further limitation shared by three benchmarks commonly used for evaluating link prediction approaches: out-of-vocabulary entities in the test and validation sets. We provide an implementation of an approach for spotting and removing such entities, along with corrected versions of the datasets WN18RR, FB15K-237, and YAGO3-10. Our experiments on the corrected versions of WN18RR, FB15K-237, and YAGO3-10 suggest that the measured performance of state-of-the-art approaches changes significantly, with p-values <1%, <1.4%, and <1%, respectively. Overall, state-of-the-art approaches gain on average an absolute $3.29 \pm 0.24\%$ across all metrics on WN18RR. This means that some of the conclusions drawn in previous work might need to be revisited. We provide an open-source implementation of our experiments and the corrected datasets at https://github.com/dice-group/OOV-In-Link-Prediction.
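The idea of spotting and removing out-of-vocabulary entities can be illustrated with a minimal Python sketch. It is not the implementation from the repository above. It assumes that a dataset split is a list of (head, relation, tail) string triples and that an entity counts as out-of-vocabulary when it never occurs as head or tail in the training split. The names `train_entities` and `split_oov` are hypothetical and chosen for illustration only.

```python
from typing import Iterable, List, Set, Tuple

# Assumed representation: one triple is (head, relation, tail).
Triple = Tuple[str, str, str]


def train_entities(train: Iterable[Triple]) -> Set[str]:
    """Collect every entity that occurs as head or tail in the training split."""
    entities: Set[str] = set()
    for head, _relation, tail in train:
        entities.add(head)
        entities.add(tail)
    return entities


def split_oov(
    triples: Iterable[Triple], known: Set[str]
) -> Tuple[List[Triple], List[Triple]]:
    """Separate triples into (kept, removed).

    A triple is removed if its head or its tail is not in `known`,
    i.e. it mentions an entity that was never seen during training.
    """
    kept: List[Triple] = []
    removed: List[Triple] = []
    for triple in triples:
        head, _relation, tail = triple
        if head in known and tail in known:
            kept.append(triple)
        else:
            removed.append(triple)
    return kept, removed


# Toy data (invented for illustration, not taken from any benchmark).
train = [("a", "r", "b"), ("b", "r", "c")]
test = [("a", "r", "c"), ("a", "r", "d")]  # "d" never occurs in train

known = train_entities(train)
kept, removed = split_oov(test, known)
print(len(kept), len(removed))
```

Applying the same filter to both the validation and the test split gives corrected splits in which every entity has a trained embedding. Whether relations that are absent from the training split should be filtered as well is left open here, since the abstract only mentions entities.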
