Cloud platforms increasingly emphasize sustainable operations to reduce their operational carbon footprint. One approach to reducing emissions is to exploit the temporal flexibility inherent in many cloud workloads by executing them during periods with the greenest electricity supply and suspending them at other times. Since such suspend-resume approaches can incur long delays in job completion times, we present a new approach that exploits the elasticity of batch workloads in the cloud to optimize their carbon emissions. Our approach is based on the notion of carbon scaling, similar to cloud autoscaling, where a job's server allocation is varied dynamically based on fluctuations in the carbon cost of the grid's electricity supply. We present an optimal greedy algorithm for minimizing a job's emissions through carbon scaling and implement a prototype of our CarbonScaler system in Kubernetes using its autoscaling capabilities, along with an analytic tool to guide the carbon-efficient deployment of batch applications in the cloud. We evaluate CarbonScaler using real-world machine learning training and MPI jobs on a commercial cloud platform and show that CarbonScaler can yield up to 50% carbon savings over a carbon-agnostic execution and up to 35% over state-of-the-art suspend-resume policies.
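To make the idea of carbon scaling concrete, the following is a minimal illustrative sketch of a greedy allocation, not the paper's actual algorithm or implementation. It assumes a per-slot carbon-intensity forecast, a non-increasing marginal-work curve (the work added by the k-th server in one slot), and identical power draw per server, so that a server's emissions in a slot are proportional to that slot's carbon intensity. The function name, parameters, and numbers are all hypothetical.

```python
from typing import List


def greedy_carbon_schedule(
    carbon: List[float],         # assumed forecast carbon intensity per time slot
    marginal_work: List[float],  # assumed work added by the k-th server in one slot
    total_work: float,           # work the job must complete
) -> List[int]:
    """Return the number of servers to allocate in each time slot."""
    # The sketch relies on diminishing returns: each extra server adds no more
    # work than the previous one, so servers in a slot are picked in order.
    if any(a < b for a, b in zip(marginal_work, marginal_work[1:])):
        raise ValueError("marginal_work must be non-increasing")

    # One candidate per (slot, k-th server), scored by work per unit of carbon.
    candidates = [
        (marginal_work[k] / carbon[t], t, k)
        for t in range(len(carbon))
        for k in range(len(marginal_work))
    ]
    # Highest work-per-carbon first; ties broken by earlier slot, then lower k.
    candidates.sort(key=lambda c: (-c[0], c[1], c[2]))

    alloc = [0] * len(carbon)
    done = 0.0
    for _, t, k in candidates:
        if done >= total_work:
            break
        alloc[t] += 1              # add the k-th server in slot t
        done += marginal_work[k]   # the last pick may overshoot total_work

    if done < total_work:
        raise ValueError("not enough capacity to finish the job in the window")
    return alloc


def emissions(alloc: List[int], carbon: List[float]) -> float:
    """Relative emissions, assuming every server draws the same power."""
    return sum(n * c for n, c in zip(alloc, carbon))


if __name__ == "__main__":
    forecast = [100.0, 60.0, 300.0]   # hypothetical intensities for three slots
    scaling = [1.0, 0.8, 0.4]         # hypothetical diminishing returns per server
    plan = greedy_carbon_schedule(forecast, scaling, total_work=2.5)
    print(plan, emissions(plan, forecast))
```

In this toy setting the job scales out in the cleanest slot and down to zero in the dirtiest one, rather than merely pausing as a suspend-resume policy would.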