Knowledge graph embedding is a representation learning technique that projects the entities and relations of a knowledge graph into continuous vector spaces. Embeddings have been widely adopted and are heavily used in link prediction and other downstream prediction tasks. Most approaches are evaluated on a single task or a single group of tasks to determine their overall performance, so the evaluation only reports how well the embedding approach performs on the task at hand. What information the embedding approaches actually learn to represent is rarely evaluated and often not deeply understood. To fill this gap, we present the DLCC (Description Logic Class Constructors) benchmark, a resource for analyzing embedding approaches in terms of which kinds of classes they can represent. Two gold standards are presented: one based on the real-world knowledge graph DBpedia and one synthetic. In addition, an evaluation framework is provided that implements an experiment protocol so that researchers can directly use the gold standards. To demonstrate the use of DLCC, we compare multiple embedding approaches using the gold standards. We find that many DL constructors on DBpedia are actually learned by recognizing correlated patterns that differ from those defined in the gold standard, and that specific DL constructors, such as cardinality constraints, are particularly hard for most embedding approaches to learn.
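The abstract does not spell out how a class defined by a DL constructor becomes a test case, so the following is only a minimal sketch under stated assumptions: a class is given by a constructor over one relation (here an existential restriction and a minimum cardinality restriction), and every entity in the graph is labeled as a member or non-member of that class. The toy triples, the relation names, and the helper `label_entities` are hypothetical and are not taken from DBpedia or from the DLCC gold standards.

```python
from collections import defaultdict

# Toy triples (hypothetical data, not from DBpedia or the DLCC gold standards).
TRIPLES = [
    ("alice", "authorOf", "book1"),
    ("alice", "authorOf", "book2"),
    ("bob", "authorOf", "book3"),
    ("carol", "knows", "alice"),
    ("dave", "knows", "bob"),
]


def objects_by_subject(triples, relation):
    """Map each subject to the set of objects it reaches via `relation`."""
    out = defaultdict(set)
    for s, p, o in triples:
        if p == relation:
            out[s].add(o)
    return out


def label_entities(triples, relation, min_count=1):
    """Split all entities into members and non-members of the class
    'at least `min_count` outgoing `relation` edges'.

    min_count=1 corresponds to an existential restriction on the relation;
    min_count>=2 corresponds to a minimum cardinality restriction.
    """
    entities = {s for s, _, _ in triples} | {o for _, _, o in triples}
    objs = objects_by_subject(triples, relation)
    positives = {e for e in entities if len(objs.get(e, ())) >= min_count}
    return positives, entities - positives


# Existential restriction: entities with at least one authorOf edge.
existential_pos, existential_neg = label_entities(TRIPLES, "authorOf", 1)
# Cardinality restriction: entities with at least two authorOf edges.
cardinality_pos, cardinality_neg = label_entities(TRIPLES, "authorOf", 2)

print(sorted(existential_pos))  # ['alice', 'bob']
print(sorted(cardinality_pos))  # ['alice']
```

Under these assumptions, one could then ask whether a classifier trained on entity embeddings separates the positive set from the negative set. The cardinality case only differs from the existential case in the edge count, which gives some intuition for why such constraints may be harder to pick up from embeddings.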