This paper presents a framework for assessing data and metadata quality in Open Data portals. A few benchmark frameworks already exist for this purpose, but they lack the breadth and depth needed to make valid statements about the actual discoverability and accessibility of publicly available data collections. To address this research gap, we designed a quality framework that evaluates data quality in Open Data portals along dedicated, fine-grained dimensions such as interoperability, findability, uniqueness or completeness. Additionally, we propose quality measures that allow valid assessments of the cross-portal findability and uniqueness of dataset descriptions. We validated our novel quality framework on the German Open Data landscape and found that metadata often still lacks meaningful descriptions and is not yet extensively connected to the Semantic Web.