Wikidata is a knowledge graph increasingly adopted by many communities for diverse applications. Wikidata statements are annotated with qualifier-value pairs that convey information such as the validity context of the statement, its causality, and its provenance. Handling qualifiers in reasoning is a challenging problem. When defining inference rules, in particular rules on ontological properties (x subclass of y, z instance of x, etc.), one must take the qualifiers into account, as most of them participate in the semantics of the statements. This is a complex problem because a) there is a massive number of qualifiers, and b) the qualifiers of the inferred statement are often a combination of the qualifiers in the rule condition. In this work, we propose to address this problem by a) defining a categorization of the qualifiers, and b) formalizing the Wikidata model with a many-sorted logical language whose sorts are the qualifier categories. We couple this logic with an algebraic specification that provides a means for effectively handling qualifiers in inference rules. The approach supports the expression of all current Wikidata ontological properties. Finally, we discuss the methodology for practically implementing the work and present a prototype implementation.