Current causally consistent data storage algorithms use partial or full replication to ensure data access to clients over a distributed setting. We develop, for the first time, an erasure coding-based algorithm called CausalEC that ensures causal consistency for a collection of read-write objects stored in a distributed set of nodes over an asynchronous message-passing system. CausalEC can use an arbitrary linear erasure code for data storage and ensures liveness, fault-tolerance, and storage properties prescribed by the erasure code. CausalEC retains a key benefit of previous replication-based algorithms - every write operation is "local", that is, a server performs only local actions before returning to a client that issued a write operation. For servers that store certain objects in an uncoded manner, read operations to those objects also return locally. In general, a read operation to an object can be returned by a server on contacting a small subset of other servers so long as the underlying erasure code allows for the object to be decoded from that subset. Notably, unlike previous consistent erasure coding-based algorithms, CausalEC is compatible with cross-object erasure coding, where nodes encode values across multiple objects. CausalEC navigates the technical challenges of cross-object erasure coding, in particular, pertaining to re-encoding when writes update the values and ensuring that concurrent reads are served in a non-blocking manner during the transition to storing codeword symbols corresponding to the updated values.
翻译:暂无翻译