By providing unprecedented access to computational resources, cloud computing has enabled rapid growth in technologies such as machine learning, whose computational demands incur a high energy cost and a commensurate carbon footprint. As a result, recent scholarship has called for better estimates of the greenhouse gas impact of AI: data scientists today do not have easy or reliable access to such measurements, which precludes the development of actionable tactics. Having cloud providers present information about software carbon intensity to users is a fundamental stepping stone towards minimizing emissions. In this paper, we provide a framework for measuring software carbon intensity and propose to measure operational carbon emissions using location-based and time-specific marginal emissions data per unit of energy. We provide measurements of operational software carbon intensity for a set of modern natural language processing and computer vision models across a wide range of model sizes, including pretraining of a 6.1 billion parameter language model. We then evaluate a suite of approaches for reducing emissions on the Microsoft Azure cloud compute platform: using cloud instances in different geographic regions, using cloud instances at different times of day, and dynamically pausing cloud instances when the marginal carbon intensity is above a certain threshold. We confirm previous results that the geographic region of the data center plays a significant role in the carbon intensity of a given cloud instance, and find that choosing an appropriate region can have the largest impact on reducing operational emissions. We also show that the time of day has a notable impact on operational software carbon intensity. Finally, we conclude with recommendations for how machine learning practitioners can use software carbon intensity information to reduce their environmental impact.
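The measurement idea described above (energy consumed, weighted by a location-based and time-specific marginal emissions rate) and the threshold-based pausing strategy can be illustrated with a minimal sketch. This is not the paper's implementation: the names `Interval`, `operational_emissions`, and `emissions_with_pausing` are hypothetical, the units (kWh and gCO2eq/kWh) are assumed, and the sketch further assumes fixed-length time windows, constant energy use per active window, and zero energy use while paused.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class Interval:
    """One fixed-length measurement window (hypothetical structure)."""
    energy_kwh: float          # energy drawn by the job during this window
    marginal_g_per_kwh: float  # marginal emissions rate for this region and time


def operational_emissions(intervals: Sequence[Interval]) -> float:
    """Sum energy times marginal intensity over all windows (gCO2eq)."""
    return sum(i.energy_kwh * i.marginal_g_per_kwh for i in intervals)


def emissions_with_pausing(
    energy_per_window_kwh: float,
    intensities: Sequence[float],
    windows_needed: int,
    threshold: float,
) -> Tuple[float, int]:
    """Run only in windows whose intensity is at or below `threshold`.

    Assumes a paused instance draws no energy (a simplification).
    Returns (total emissions in gCO2eq, number of windows elapsed).
    """
    total = 0.0
    done = 0
    for elapsed, intensity in enumerate(intensities, start=1):
        if intensity <= threshold:
            total += energy_per_window_kwh * intensity
            done += 1
            if done == windows_needed:
                return total, elapsed
    raise ValueError("job did not finish within the available windows")


# Illustrative (made-up) intensity series for one region, in gCO2eq/kWh.
series = [400.0, 700.0, 300.0, 800.0, 500.0]

# Baseline: run the 3 windows back to back from the start.
baseline = operational_emissions([Interval(2.0, x) for x in series[:3]])

# Pausing policy: skip windows above 600 gCO2eq/kWh.
paused, elapsed = emissions_with_pausing(2.0, series, 3, 600.0)
```

In this toy series the pausing policy lowers emissions at the cost of a longer wall-clock duration, which is the trade-off such a policy has to balance; the actual size of the effect depends on the region and on real marginal emissions data.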