In the distributed and dynamic framework of the Web, data quality is a major challenge. Linked Open Data (LOD) provides an enormous amount of data whose quality is difficult to control. Quality is intrinsically a matter of usage, so consumers need ways to specify quality rules that make sense for their own use, in order to obtain only data that conforms to these rules. We propose a user-side query framework equipped with a checker for constraints and confidence levels, applied to the data returned by LOD providers' query evaluations. We detail its theoretical foundations and provide experimental results showing that the additional cost of checking is reasonable, and that integrating the constraints into the queries reduces this cost significantly further.