A distributed multi-writer multi-reader (MWMR) atomic register is an important primitive that enables a wide range of distributed algorithms, so improving its performance can have far-reaching consequences. Since the seminal work on ABD emulation in message-passing networks [JACM '95], many researchers have studied fast implementations of atomic registers under various conditions. Here, "fast" means that a read or a write completes in 1 round-trip time (RTT) by contacting a simple majority of nodes. In this work, we explore an atomic register with optimal resilience and "optimistically fast" read and write operations; that is, both operations can be fast if there is no concurrent write. This paper makes three contributions: (i) we present Gus, an emulation of an MWMR atomic register with optimal resilience and optimistically fast reads and writes when there are up to 5 nodes; (ii) we show that when there are more than 5 nodes, it is impossible to emulate an MWMR atomic register with both properties; and (iii) we implement Gus in the framework of EPaxos and Gryff, and show that Gus provides lower tail latency than state-of-the-art systems such as EPaxos, Gryff, Giza, and Tempo under various workloads in the context of geo-replicated object storage systems.