Internet-scale distributed systems often replicate data at multiple geographic locations to provide low latency and high availability. The Conflict-free Replicated Data Type (CRDT) is a framework that provides a principled approach to maintaining eventual consistency among data replicas. CRDTs have been notoriously difficult to design and implement correctly. Subtle deep bugs lie in the complex and tedious handling of all possible cases of conflicting data updates. We argue that the CRDT design should be formally specified and model-checked to uncover deep bugs. The implementation further needs to be systematically tested. On the one hand, the testing needs to inherit the exhaustive nature of the model checking and ensures the coverage of testing. On the other hand, the testing is expected to find coding errors which cannot be detected by design level verification. Towards the challenges above, we propose the Model Checking-driven Explorative Testing (MET) framework. At the design level, MET uses TLA+ to specify and model check CRDT designs. At the implementation level, MET conducts model checking-driven explorative testing, in the sense that the test cases are automatically generated from the model checking traces. The system execution is controlled to proceed deterministically, following the model checking trace. The explorative testing systematically controls and permutes all nondeterministic message reorderings. We apply MET in our practical development of CRDTs. The bugs in both designs and implementations of CRDTs are found. As for bugs which can be found by traditional testing techniques, MET greatly reduces the cost of fixing the bugs. Moreover, MET can find subtle deep bugs which cannot be found by existing techniques at a reasonable cost. We further discuss how MET provides us with sufficient confidence in the correctness of our CRDT designs and implementations.
翻译:互联网分布式系统往往在多个地理地点复制数据,以提供低潜值和高可用性。无冲突复制数据类型(CRDT)是一个框架,为维护数据复制品的最终一致性提供了原则性方法。 CRDT一直难以正确设计和实施。 深度的错误在于复杂和冗长地处理所有可能的数据更新相互矛盾的案例。 我们主张, CRDT设计应正式指定,并进行模型检查,以发现深层错误。 实施还需要系统测试。 一方面, 需要继承模型检查的彻底性, 并确保测试的覆盖面。 另一方面, 测试将找到无法通过设计级别核查检测的编码错误。 面对上述挑战, 我们提议模型检查驱动的探索性测试框架。 在设计层面, MET 使用 TLA+ 来指定和模式检查CRDDT设计。 在实施层面, MET 将采用由常规检查驱动的模型测试降低成本性测试。 从意识上看, 测试案例是自动地遵循了模型的精确性测试, 测试系统测试系统测试系统测试系统测试系统测试。 我们发现, 测试可以使用成本性测试系统测试系统测试系统操作, 系统操作,可以找到所有系统测试系统操作。