Coo: Rethink Data Anomalies In Databases

Transaction processing technology has three key components: data anomalies, isolation levels, and concurrency control algorithms. Concurrency control algorithms eliminate some or all data anomalies at different isolation levels to ensure data consistency. Isolation levels in the current ANSI standard are defined by disallowing certain kinds of data anomalies. Yet the definitions of data anomalies in the ANSI standard are controversial. On the one hand, the definitions lack a mathematical formalization and admit ambiguous interpretations. On the other hand, the definitions are made case by case, so that even a senior DBA cannot have infallible knowledge of data anomalies, for lack of a full understanding of their nature. While revised definitions in the existing literature propose various mathematical formalizations to address the former problem, how to address the latter remains an open problem. In this paper, we present a general framework called Coo that can systematically define data anomalies. Under this framework, we show that the data anomalies reported so far are only a small portion of all data anomalies. We theoretically prove that Coo is complete in mathematically formalizing data anomalies, and we employ a novel method to classify the infinitely many data anomalies. In addition, we use this framework to define new isolation levels and to quantitatively describe the concurrency and rollback rate of mainstream concurrency control algorithms. This work shows that the C and I of ACID can be quantitatively analyzed based on all data anomalies.
