Distributed in-memory datastores underpin cloud applications that run within a datacenter and demand high performance, strong consistency, and availability. A key feature of datastores is data replication. The data are replicated across servers because a single server often cannot handle the request load. Replication is also necessary to guarantee that a server or link failure does not render a portion of the dataset inaccessible. A replication protocol is responsible for ensuring strong consistency between the replicas of a datastore, even when faults occur, by determining the actions necessary to access and manipulate the data. Consequently, a replication protocol also drives the datastore's performance. Existing strongly consistent replication protocols deliver fault tolerance but fall short in terms of performance. Meanwhile, the opposite occurs in the world of multiprocessors, where data are replicated across the private caches of different cores. The multiprocessor regime uses invalidations to afford strongly consistent replication with high performance but neglects fault tolerance. Although handling failures in the datacenter is critical for data availability, we observe that the common operation is fault-free and far exceeds the operation during faults. In other words, the common operating environment inside a datacenter closely resembles that of a multiprocessor. Based on this insight, we draw inspiration from the multiprocessor for high-performance, strongly consistent replication in the datacenter. The primary contribution of this thesis is in adapting invalidating protocols to the nuances of replicated datastores, which include skewed data accesses, fault tolerance, and distributed transactions.