Spot instances are virtual machines offered at 60-90% lower cost that can be reclaimed at any time with only a short warning period. Spot instances have already been used to significantly reduce the cost of processing workloads in the cloud. However, leveraging spot instances to reduce the cost of stateful cloud applications is much more challenging, as sudden preemptions lead to data loss. In this work, we propose leveraging spot instances to decrease the cost of ephemeral data management in distributed data analytics applications. We specifically target ephemeral data because this large class of data in modern analytics workloads has low durability requirements: if lost, the data can be regenerated by re-executing compute tasks. We design an elastic, distributed ephemeral datastore that handles node preemptions transparently to user applications and minimizes data loss by redistributing data during node preemption warning periods. We implement our elastic datastore on top of the Apache Crail datastore and evaluate the system with various workloads and VM types. By leveraging spot instances, we show that we can run TPC-DS queries at 60% lower cost compared to using on-demand VMs for the datastore, while increasing end-to-end execution time by only 2.1%.
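To make the redistribution idea concrete, the following is a minimal sketch of draining a storage node once it receives a preemption warning. It is an illustration under stated assumptions, not the system's implementation and not the Apache Crail API: the `Node` class, the `drain_on_warning` function, the block-count capacity model, and the "most free space first" placement policy are all hypothetical, and the sketch ignores the time limit imposed by the warning period.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Node:
    """Hypothetical storage node; capacity is counted in blocks."""
    name: str
    capacity: int
    blocks: Set[str] = field(default_factory=set)

    def free(self) -> int:
        return self.capacity - len(self.blocks)


def drain_on_warning(nodes: Dict[str, Node], warned: str) -> List[str]:
    """Move blocks off the warned node onto the surviving nodes.

    Returns the blocks that could not be placed. Because the data is
    ephemeral, those blocks would be regenerated by re-executing the
    compute tasks that produced them.
    """
    victim = nodes.pop(warned)
    lost: List[str] = []
    for block in sorted(victim.blocks):
        # Assumed policy: place on the surviving node with the most free space.
        target = max(nodes.values(), key=Node.free, default=None)
        if target is None or target.free() <= 0:
            lost.append(block)
        else:
            target.blocks.add(block)
    return lost


if __name__ == "__main__":
    cluster = {
        "spot-1": Node("spot-1", capacity=4, blocks={"b1", "b2", "b3"}),
        "spot-2": Node("spot-2", capacity=2, blocks={"x"}),
        "spot-3": Node("spot-3", capacity=2),
    }
    print("lost blocks:", drain_on_warning(cluster, "spot-1"))
```

In this toy model, blocks are lost only when the surviving nodes run out of capacity; a real datastore would also have to bound the transfer by the length of the warning period.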