Replication ensures data availability in fault-prone distributed systems. The celebrated CAP theorem stipulates that replicas cannot guarantee both strong consistency and availability under network partitions. A popular alternative, adopted by CRDTs, is to relax consistency to be eventual. It enables progress to be wait-free, as replicas can serve requests immediately. Yet, wait-free replication faces a key challenge: due to asynchrony and concurrency, operations may be constantly reordered, leading to results inconsistent with their original contexts and preventing them from stabilizing over time. Moreover, a particular client may experience starvation if, from some point on, each of its operations is reordered at least once. We make two contributions. First, we formalize the problem addressed by wait-free replicated data types (e.g., CRDTs) as eventual state-machine replication. We then augment it with stability and fairness ensuring, respectively, that (1)~all replicas share a growing stable prefix of operations, and (2)~no client starves. Second, we present a generic DAG-based framework to achieve eventual state-machine replication for any replicated data type, where replicas exchange their local views and merge them using a reconciliation function. We then propose reconciliation functions ensuring stability and fairness.
翻译:暂无翻译