Data provenance consists of tracking meta-information during query evaluation, in order to enrich query results with their trust level, likelihood, evaluation cost, and more. The framework of semiring provenance abstracts from the specific kind of meta-information that annotates the data. While the definition of semiring provenance is uncontroversial for unions of conjunctive queries, the picture is less clear for Datalog. Indeed, the original definition may involve infinite computations, and it is not consistent with other proposals for Datalog semantics over annotated data. In this work, we propose and investigate several provenance semantics, based on different approaches to defining classical Datalog semantics. We study the relationships between these semantics, and introduce properties that allow us to analyze and compare them.
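To make the remark about infinite computations concrete, the following is a minimal sketch, not one of the semantics proposed in this work. It runs a naive fixpoint iteration for the usual transitive-closure program over two semirings. The names `Semiring`, `TROPICAL`, `COUNTING`, `path_annotations`, and the cutoff `max_rounds` are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple

Fact = Tuple[str, str]


@dataclass(frozen=True)
class Semiring:
    """A commutative semiring (K, plus, times, zero, one); illustrative only."""
    zero: object
    one: object
    plus: Callable
    times: Callable


# Tropical semiring: annotations are costs, alternatives keep the cheapest.
TROPICAL = Semiring(zero=float("inf"), one=0.0, plus=min,
                    times=lambda a, b: a + b)
# Counting semiring: annotations count derivations.
COUNTING = Semiring(zero=0, one=1, plus=lambda a, b: a + b,
                    times=lambda a, b: a * b)


def path_annotations(edges: Dict[Fact, object], K: Semiring,
                     max_rounds: int = 50):
    """Naive iteration for the program
         path(x, y) :- edge(x, y).
         path(x, y) :- edge(x, z), path(z, y).
    Returns (annotations, rounds), with rounds = None if no fixpoint
    was reached within max_rounds (an assumed, arbitrary cutoff)."""
    path: Dict[Fact, object] = {}
    for round_no in range(1, max_rounds + 1):
        new: Dict[Fact, object] = {}
        # First rule: every edge contributes its own annotation.
        for (x, y), a in edges.items():
            new[(x, y)] = K.plus(new.get((x, y), K.zero), a)
        # Second rule: join an edge with a path from the previous round.
        for (x, z), a in edges.items():
            for (z2, y), b in path.items():
                if z == z2:
                    new[(x, y)] = K.plus(new.get((x, y), K.zero),
                                         K.times(a, b))
        if new == path:
            return path, round_no
        path = new
    return path, None


# A two-node cycle: a -> b -> a.
costs, rounds_tropical = path_annotations(
    {("a", "b"): 2.0, ("b", "a"): 3.0}, TROPICAL)
counts, rounds_counting = path_annotations(
    {("a", "b"): 1, ("b", "a"): 1}, COUNTING)
```

On this cyclic input the tropical iteration stabilizes (`rounds_tropical` is 3), because longer paths never beat the cheapest one. The counting iteration keeps growing and hits the cutoff (`rounds_counting` is `None`), since a cycle admits unboundedly many derivations.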