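Research on Provenance Methods for Uncertain Relational Data

Project name: Research on Provenance Methods for Uncertain Relational Data

Project number: No. 61202033

Project type: Young Scientists Fund

Year approved: 2013

Discipline: Computer Science

Project author: Wang Liwei (王黎维)

Affiliation: Wuhan University (武汉大学)

Funding: CNY 240,000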

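Chinese abstract: Collecting, transmitting, and converting data all introduce uncertainty. Many scientific applications and large-scale data management tasks nevertheless need to integrate and process large volumes of data from different sources, and the uncertainty in that data calls the credibility of the integrated results into question. Supporting provenance for uncertain data, with effective and convenient queries over data sources and processing steps, can help users understand how credible a result is. This project aims to achieve provenance management for uncertain relational data. It focuses on establishing a provenance model for uncertain relational data and studies the representation, acquisition, storage, querying, and visualization of uncertain provenance information. Considering both attribute-level and tuple-level uncertainty, it adopts a multi-granularity representation of provenance information to make that representation more flexible. It proposes a compressed storage method for tuple-level provenance information and explores table-level provenance storage together with efficient methods for converting it into tuple-level provenance information. It studies methods for computing result credibility from provenance information and tuple dependence, and explores optimizations that reduce the complexity of this computation. Finally, it designs a visualization scheme for provenance information and builds a visual platform for provenance information management.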

Chinese keywords: uncertainty; provenance representation; provenance storage; provenance query; credibility computation

English abstract: Uncertainty arises whenever data are collected, transmitted, or converted. Because many scientific applications and large-scale data management tasks must integrate and process large volumes of data from different sources, this uncertainty calls the credibility of the integrated results into question. Supporting provenance for uncertain data, so that data sources and processing steps can be queried, can help users understand the credibility of the results. This project aims to achieve provenance management for uncertain relational data. It focuses on establishing a provenance model for uncertain relational data and investigates provenance representation, acquisition, storage, retrieval, and visualization for such data. Considering uncertainty at both the attribute level and the tuple level, we adopt a multi-granularity representation of provenance to improve its flexibility. We propose a compression and storage method for tuple-level provenance, and explore table-level provenance storage together with methods for converting provenance from table level to tuple level. We also investigate credibility computation based on provenance and tuple dependence, and explore optimization techniques to reduce its computational complexity. Finally, we design provenance visu

English keywords: Uncertainty; Provenance Representation; Provenance Storage; Provenance Retrieval; Probabilistic Computation
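
The abstract mentions computing the credibility of a result from its provenance. As a rough illustration only, and not the project's own method, the sketch below treats tuple-level provenance as a list of alternative derivations, each a set of base tuple identifiers, and computes the probability that at least one derivation holds by enumerating possible worlds. The function name `result_probability`, the tuple identifiers, the probabilities, and the assumption that base tuples are independent are all hypothetical choices made for this example; the project explicitly considers tuple dependence, which this sketch does not model.

```python
from itertools import product


def result_probability(lineage, base_probs):
    """Probability that a result tuple exists, given its provenance.

    lineage    -- list of sets; each set holds the base tuple ids that one
                  derivation needs (the result exists if any derivation holds)
    base_probs -- dict mapping base tuple id to its existence probability
    Assumes base tuples are mutually independent (illustrative assumption).
    """
    ids = sorted(base_probs)
    total = 0.0
    # Enumerate every possible world: each base tuple is present or absent.
    for world in product([False, True], repeat=len(ids)):
        present = {t for t, keep in zip(ids, world) if keep}
        # Probability of this world under the independence assumption.
        p = 1.0
        for t, keep in zip(ids, world):
            p *= base_probs[t] if keep else 1.0 - base_probs[t]
        # The result exists in this world if some derivation is fully present.
        if any(clause <= present for clause in lineage):
            total += p
    return total


# Hypothetical base tuples with existence probabilities.
base_probs = {"t1": 0.9, "t2": 0.5, "t3": 0.8}
# The result can be derived from (t1 and t2) or from (t1 and t3).
lineage = [{"t1", "t2"}, {"t1", "t3"}]
credibility = result_probability(lineage, base_probs)
print(round(credibility, 4))
```

Enumeration is exponential in the number of base tuples, which is one reason the abstract's interest in optimizations that reduce computational complexity matters in practice.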
