Automated machine learning (AutoML) strives for the automatic configuration of machine learning algorithms and their composition into an overall (software) solution, a machine learning pipeline, tailored to the learning task (dataset) at hand. Over the last decade, AutoML has developed into an independent research field with hundreds of contributions. At the same time, AutoML is being criticised for its high resource consumption, as many approaches rely on the (costly) evaluation of many machine learning pipelines, as well as for the expensive large-scale experiments across many datasets and approaches. In the spirit of recent work on Green AI, this paper proposes Green AutoML, a paradigm to make the whole AutoML process more environmentally friendly. To this end, we first elaborate on how to quantify the environmental footprint of an AutoML tool. Afterward, we summarise different strategies for designing and benchmarking an AutoML tool with respect to its "greenness", i.e., sustainability. Finally, we elaborate on how to be transparent about the environmental footprint and what kind of research incentives could direct the community toward more sustainable AutoML research. Additionally, we propose a sustainability checklist, featuring all core aspects of Green AutoML, to be attached to every AutoML paper.