Datacenter power demand has been growing continuously and is the key driver of datacenter cost. An accurate mapping of compute resources (CPU, RAM, etc.) and hardware types (servers, accelerators, etc.) to power consumption has emerged as a critical requirement for major Web and cloud service providers. With the global growth in datacenter capacity and the associated power consumption, such models are essential for important decisions around datacenter design and operation. In this paper, we discuss two classes of statistical power models designed and validated to be accurate, simple, interpretable, and applicable to all hardware configurations and workloads across the hyperscale datacenters of Google's fleet. To the best of our knowledge, this is the largest-scale power modeling study of its kind, both in the scope of the datacenter planning and real-time management use cases it covers and in the variety of hardware configurations and workload types used for modeling and validation. We demonstrate that the proposed statistical modeling techniques, while simple and scalable, predict power with less than 5% Mean Absolute Percent Error (MAPE) for more than 95% of a diverse set of Power Distribution Units (more than 2000), using only 4 features. This performance matches the reported accuracy of previous state-of-the-art methods while using significantly fewer features and covering a wider range of use cases.
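To make the accuracy criterion concrete, the following is a minimal sketch of how MAPE and the share of Power Distribution Units under a 5% threshold could be computed. It uses the standard MAPE definition; the function names, the `threshold` parameter, and the sample readings are hypothetical illustrations, not the paper's code or data.

```python
from typing import Dict, Sequence


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean Absolute Percent Error, in percent (standard definition)."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Average of |actual - predicted| / |actual| over all samples.
    total = sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def share_under_threshold(mape_by_pdu: Dict[str, float],
                          threshold: float = 5.0) -> float:
    """Fraction of units whose MAPE is strictly below the threshold."""
    if not mape_by_pdu:
        raise ValueError("no units given")
    hits = sum(1 for m in mape_by_pdu.values() if m < threshold)
    return hits / len(mape_by_pdu)


# Hypothetical measured vs. predicted power readings (kW) for two units.
readings = {
    "pdu-a": ([100.0, 200.0], [98.0, 204.0]),   # 2% error on each sample
    "pdu-b": ([100.0, 200.0], [90.0, 220.0]),   # 10% error on each sample
}
mape_by_pdu = {name: mape(a, p) for name, (a, p) in readings.items()}
print(mape_by_pdu)
print(share_under_threshold(mape_by_pdu))
```

With these made-up readings, one of the two units falls under the 5% threshold. The abstract's claim corresponds to this share exceeding 0.95 across more than 2000 units.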