NoSQL document stores are becoming increasingly popular as backends in web development. Not only do they scale out to large volumes of data; many systems are also tailored to this domain: NoSQL document stores such as Google Cloud Datastore are designed to support massively parallel reads, and even guarantee strong consistency when updating single data objects. However, strongly consistent updates cannot be made arbitrarily fast in large-scale distributed systems. Consequently, data objects that experience high-frequency writes can turn into severe performance bottlenecks. In this paper, we present AutoShard, a ready-to-use object mapper for Java applications running against NoSQL document stores. AutoShard's unique feature is its ability to gracefully shard hot spot data objects to avoid write contention. Using AutoShard, developers can handle hot spot data objects by adding minimally intrusive annotations to their application code. Our experiments show that sharding has a significant impact on both write throughput and execution time.
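To illustrate the idea of sharding a hot spot data object, the following is a minimal sketch of a sharded counter. It is not AutoShard's API or implementation: the class `ShardedCounter` and its methods are hypothetical, and an in-memory `AtomicLongArray` stands in for what would be separately stored shard entities in a document store. Each write touches one randomly chosen shard, so concurrent writers rarely contend on the same object; a read aggregates all shards.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicLongArray;

/**
 * Hypothetical sketch of a sharded counter (not AutoShard's API).
 * Each array slot stands in for one shard entity that a document
 * store would persist and update independently.
 */
final class ShardedCounter {
    private final AtomicLongArray shards;

    ShardedCounter(int numShards) {
        if (numShards <= 0) {
            throw new IllegalArgumentException("numShards must be positive");
        }
        this.shards = new AtomicLongArray(numShards);
    }

    /** Write path: update one randomly chosen shard instead of a single hot object. */
    void increment() {
        int index = ThreadLocalRandom.current().nextInt(shards.length());
        shards.incrementAndGet(index);
    }

    /** Read path: aggregate the values of all shards. */
    long sum() {
        long total = 0;
        for (int i = 0; i < shards.length(); i++) {
            total += shards.get(i);
        }
        return total;
    }
}
```

The trade-off in this sketch is that writes become cheaper to parallelize while reads must visit every shard, so the number of shards would be chosen according to the expected write rate.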