Continuous developments in data science have brought an exponential increase in the complexity of machine learning models. At the same time, data scientists have become ubiquitous in the private sector, in academia, and even among hobbyists. Both trends are rising steadily, and both are associated with increased power consumption and a correspondingly larger carbon footprint. The growing carbon footprint of large-scale advanced data science has already received attention, but that of the latter trend, the spread of everyday data science, has not. This work aims to estimate the contribution of the increasingly popular "common" data science to the global carbon footprint. To this end, the power consumption of several typical common data science tasks is measured and compared to that of large-scale "advanced" data science, common computer-related tasks, and everyday non-computer-related tasks. To make these comparable, the measurements are converted to the equivalent unit of "km driven by car". Our main findings are: "common" data science consumes $2.57$ more power than regular computer usage, but less than some common everyday power-consuming tasks such as lighting or heating; large-scale data science consumes substantially more power than common data science.
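The conversion of a power measurement into "km driven by car" could be sketched as follows. This is only an illustration under stated assumptions: the function names, the grid carbon intensity (400 g CO2 per kWh), and the car emission factor (120 g CO2 per km) are hypothetical placeholders, not values or methods taken from this work.

```python
def power_to_energy_kwh(avg_power_watts: float, duration_hours: float) -> float:
    """Convert an average power draw over a duration into energy in kWh."""
    if avg_power_watts < 0 or duration_hours < 0:
        raise ValueError("power and duration must be non-negative")
    return avg_power_watts * duration_hours / 1000.0


def energy_to_car_km(
    energy_kwh: float,
    grid_g_co2_per_kwh: float = 400.0,  # assumed placeholder, not from the paper
    car_g_co2_per_km: float = 120.0,    # assumed placeholder, not from the paper
) -> float:
    """Express consumed energy as the car distance emitting the same CO2."""
    if car_g_co2_per_km <= 0:
        raise ValueError("car emission factor must be positive")
    emitted_g_co2 = energy_kwh * grid_g_co2_per_kwh
    return emitted_g_co2 / car_g_co2_per_km


if __name__ == "__main__":
    # Hypothetical task: 150 W average draw for 2 hours.
    energy = power_to_energy_kwh(150.0, 2.0)
    print(f"{energy:.3f} kWh is roughly {energy_to_car_km(energy):.2f} km by car")
```

Both factors vary widely by region and vehicle, so any real comparison would need to replace the defaults with sourced values.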