Quantifying the inconsistency of a database is motivated by various goals, including reliability estimation for new datasets and progress indication in data cleaning. Another goal is to attribute to individual tuples a level of responsibility for the overall inconsistency, and thereby prioritize tuples in the explanation or inspection of dirt. Therefore, inconsistency quantification and attribution have been a subject of much research in Knowledge Representation and, more recently, in Databases. As in many other fields, a conventional responsibility-sharing mechanism is the Shapley value from cooperative game theory. In this paper, we carry out a systematic investigation of the complexity of computing the Shapley value for common inconsistency measures for functional-dependency (FD) violations. For several measures we establish a full classification of the FD sets into tractable and intractable classes with respect to Shapley-value computation. We also study the complexity of approximation in intractable cases.
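To make the attribution idea concrete, the following is a minimal brute-force sketch, not the method of the paper. It treats each tuple as a player and an inconsistency measure as the game's value function, then applies the standard Shapley formula. The toy relation, the single FD A -> B, the choice of measure (number of violating tuple pairs), and the function names are all assumptions made for illustration. The enumeration is exponential in the number of tuples, so it is only suitable for very small inputs.

```python
from fractions import Fraction
from itertools import combinations
from math import factorial

# Hypothetical toy relation over attributes (A, B) with the single FD A -> B.
TUPLES = [("a1", "b1"), ("a1", "b2"), ("a1", "b3"), ("a2", "b1")]


def violating_pairs(rows):
    """Assumed inconsistency measure: number of tuple pairs that violate A -> B."""
    return sum(
        1
        for (a1, b1), (a2, b2) in combinations(rows, 2)
        if a1 == a2 and b1 != b2
    )


def shapley(rows, measure):
    """Exact Shapley value of every tuple, by enumerating all subsets."""
    n = len(rows)
    values = []
    for i in range(n):
        others = rows[:i] + rows[i + 1:]
        total = Fraction(0)
        for k in range(n):  # k = size of the coalition that tuple i joins
            weight = Fraction(factorial(k) * factorial(n - k - 1), factorial(n))
            for subset in combinations(others, k):
                coalition = list(subset)
                # Marginal contribution of tuple i to this coalition.
                gain = measure(coalition + [rows[i]]) - measure(coalition)
                total += weight * gain
        values.append(total)
    return values


if __name__ == "__main__":
    for row, value in zip(TUPLES, shapley(TUPLES, violating_pairs)):
        print(row, value)
```

In this toy instance the three tuples with A = a1 conflict pairwise and the remaining tuple conflicts with nothing. The per-tuple values sum to the measure of the whole relation, as the efficiency property of the Shapley value requires.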