A scoring function (SF) measures the plausibility of triplets in a knowledge graph. Different scoring functions can lead to large differences in link prediction performance across knowledge graphs. In this report, we describe a weird scoring function found by random search on the Open Graph Benchmark (OGB). This scoring function, called AutoWeird, uses only the tail entity and the relation of a triplet to compute its plausibility score. Experimental results show that AutoWeird achieves top-1 performance on the ogbl-wikikg2 data set, but performs much worse than other methods on the ogbl-biokg data set. By analyzing the tail entity distribution and the evaluation protocol of these two data sets, we attribute the unexpected success of AutoWeird on ogbl-wikikg2 to an inappropriate evaluation protocol and a concentrated tail entity distribution. Such results may motivate further research on how to accurately evaluate the performance of different link prediction methods for knowledge graphs.
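To make concrete what "uses only the tail entity and the relation" means, the following is a minimal sketch contrasting a DistMult-style score, which depends on all three elements of a triplet, with a hypothetical tail-only score. The dot-product form of `tail_only_score`, the random embeddings, and all names here are assumptions for illustration; the abstract does not give AutoWeird's actual formula.

```python
import random

random.seed(0)
DIM = 4  # embedding dimension, chosen arbitrarily for this sketch


def random_vector():
    """Return a random embedding vector (stand-in for a learned one)."""
    return [random.gauss(0.0, 1.0) for _ in range(DIM)]


# Toy embedding tables: 5 entities, 2 relations.
entity_emb = [random_vector() for _ in range(5)]
relation_emb = [random_vector() for _ in range(2)]


def distmult_score(h, r, t):
    """DistMult-style score: depends on head, relation and tail."""
    return sum(
        a * b * c
        for a, b, c in zip(entity_emb[h], relation_emb[r], entity_emb[t])
    )


def tail_only_score(h, r, t):
    """Hypothetical tail-only score: h is accepted but never used."""
    return sum(b * c for b, c in zip(relation_emb[r], entity_emb[t]))
```

With a function of this shape, swapping the head entity leaves the score unchanged, so the ranking of candidate tails for a given relation is the same for every head.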