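Project title: Query Answering with Trust Annotations over Inconsistent Relational Databases
Project number: No.61202022
Project type: Young Scientists Fund
Approval year: 2013
Discipline: Computer Science
Project author: Wu Aihua (吴爱华)
Affiliation: Shanghai Maritime University (上海海事大学)
Project funding: 230,000 CNY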
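Chinese abstract: Inconsistent data contains anomalies and contradictions, so query results over it may also be inconsistent, yet correcting or removing inconsistent data often distorts or loses information. This project studies how to identify inconsistent relational data and how to derive and rank it in query results, helping users distinguish consistent from inconsistent data at the attribute level without losing information or modifying data. The main research topics are: 1) across multiple types of integrity constraints, define a new model for inconsistent data, the annotated relational data model, find a set of inference rules for inconsistency annotations in each class of query, and develop a query algebra over the model so that inconsistency annotations are propagated correctly during query evaluation; 2) study the implementation of annotated query computation, find algorithms that automatically detect and mark inconsistent data, and give algorithms that rewrite each class of user query into an annotated query; 3) abstract users' choices about inconsistent data as secondary annotations, propose algorithms that rank and repair inconsistent query results based on secondary annotations, and, since both kinds of annotation are dependent on the data they mark, sparse, and high-dimensional, give storage and indexing methods for them. The results have practical value in data exchange, data integration, data extraction, sensor networks, and other applications.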
Chinese keywords: inconsistent data; data quality; integrity constraints; consistent query answering; annotation
English abstract: Inconsistent data implies invalid information, and so do query answers over it, while correcting or deleting inconsistent data usually results in distorted or lost information. Instead, we try an approach named Annotation Based Query Answer, which recognizes and marks inconsistent data down to the attribute level in both the source data and the query result, so that valuable query answers can be returned without information loss or data changes. In this approach, every piece of data in a relation can carry zero or more annotations, and annotations are propagated along with queries from the source to the output. The approach mainly focuses on the following problems: 1) a data model for inconsistent databases, a set of algebraic queries over it, and a set of rules for propagating annotations during query evaluation, so that the Annotation Based Query Answer can be calculated correctly even if the database violates multiple types of integrity constraints; 2) algorithms to check and annotate the input tables, and query rewriting algorithms that translate a typical user SQL query into a set of queries that return the annotation based query answer; 3) an annotation system for users to express the reason for an inconsistency and indicate the right value of inconsistent data, and algorithms to repair source data or rank query answers based
English keywords: inconsistent data; data quality; integrity constraints; consistent query answer; annotation
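
The abstract's core idea, attribute-level annotations that travel with a query from source to output, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the project: the sample `employees` table, the functional dependency `dept -> city`, the helper names `annotate_fd` and `select_project`, and the simple rule that a mark stays attached to its cell through selection and projection. The project's own data model and propagation rules are not given in this record and may differ.

```python
from collections import defaultdict


def annotate_fd(rows, lhs, rhs):
    """Mark the rhs cell of every row that takes part in a violation of lhs -> rhs.

    Returns one set of marked attribute names per input row (hypothetical helper).
    """
    seen = defaultdict(set)
    for row in rows:
        seen[tuple(row[a] for a in lhs)].add(row[rhs])
    marks = []
    for row in rows:
        key = tuple(row[a] for a in lhs)
        # More than one rhs value for the same lhs value means the cell is suspect.
        marks.append({rhs} if len(seen[key]) > 1 else set())
    return marks


def select_project(rows, marks, predicate, attrs):
    """Filter rows, keep only attrs, and carry each cell's mark into the output.

    Assumed propagation rule: a mark survives only if its attribute is projected.
    """
    out = []
    for row, mark in zip(rows, marks):
        if predicate(row):
            out.append(({a: row[a] for a in attrs}, mark & set(attrs)))
    return out


# Illustrative data: the two R&D rows disagree on city, violating dept -> city.
employees = [
    {"name": "Ann", "dept": "R&D", "city": "Shanghai"},
    {"name": "Bob", "dept": "R&D", "city": "Beijing"},
    {"name": "Cid", "dept": "Ops", "city": "Shanghai"},
]

marks = annotate_fd(employees, lhs=["dept"], rhs="city")
answer = select_project(
    employees, marks, lambda r: r["city"] == "Shanghai", ["name", "city"]
)

for values, marked in answer:
    print(values, "inconsistent attributes:", sorted(marked))
```

No row is deleted or corrected here. Ann's row is still returned, but its `city` value arrives flagged, while Cid's row arrives unflagged, so a user can tell the two apart in the query result.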