In the deeply interconnected world we live in, pieces of information link domains all around us. As graph databases embrace effectively relationships among data and allow processing and querying these connections efficiently, they are rapidly becoming a popular platform for storage that supports a wide range of domains and applications. As in the relational case, it is expected that data preserves a set of integrity constraints that define the semantic structure of the world it represents. When a database does not satisfy its integrity constraints, a possible approach is to search for a 'similar' database that does satisfy the constraints, also known as a repair. In this work, we study the problem of computing subset and superset repairs for graph databases with data values using a notion of consistency based on a set of Reg-GXPath expressions as integrity constraints. We show that for positive fragments of Reg-GXPath these problems admit a polynomial-time algorithm, while the full expressive power of the language renders them intractable.
翻译:在我们所生活的这个紧密相联的世界里,我们周围都有一些信息。由于图表数据库有效地包含了数据之间的关系,并且能够有效地处理和查询这些联系,它们正在迅速成为一个受欢迎的存储平台,可以支持广泛的领域和应用。同关系案例一样,预计数据保留了一套完整的限制,可以界定它所代表的世界的语义结构。当一个数据库不能满足其完整性限制时,一种可能的办法就是寻找一个“相似”的数据库,这个数据库能够满足各种限制,也称为修复。在这项工作中,我们用基于一套 Reg-GXPath 表达的完整性限制的一致概念来研究对带有数据值的图形数据库进行子集和超集修复的问题。我们表明,对于Reg-GXPath 的正面碎片来说,这些问题包含一种多元时间算法,而语言的完全表达力使它们变得难以解决。