OLTP applications whose workloads exceed the capacity of a single server need to scale out to multiple servers. Typically, scaling out entails assigning a different partition of the application state to each server. But data partitioning is at odds with preserving the strong consistency guarantees of ACID transactions, a fundamental building block of many OLTP applications. The more we scale out and spread data across servers, the more frequent distributed transactions, which access data at different servers, become. With a large number of servers, the high cost of distributed transactions makes scaling out ineffective or even detrimental. In this paper we propose Operation Partitioning, a novel paradigm for scaling out OLTP applications that require ACID guarantees. Operation Partitioning uses static analysis to partition the application's operations, which indirectly partitions the data across servers. This partitioning of operations yields a lock-free Conveyor Belt protocol for distributed coordination, which can scale out unmodified applications running on top of unmodified database management systems. We implement the protocol in a system called Elia and use it to scale out two applications, TPC-W and RUBiS. Our experiments show that Elia can increase maximum throughput by up to 4.2x and reduce latency by up to 58.6x compared to MySQL Cluster, while also providing a stronger isolation guarantee (serializability instead of read committed).