Automated Machine Learning (AutoML) supports practitioners and researchers in the tedious task of designing machine learning pipelines and has recently achieved substantial success. In this paper, we introduce new AutoML approaches motivated by our winning submission to the second ChaLearn AutoML challenge. We develop PoSH Auto-sklearn, which enables AutoML systems to work well on large datasets under rigid time limits by using a new, simple and meta-feature-free meta-learning technique and by employing a successful bandit strategy for budget allocation. However, PoSH Auto-sklearn introduces even more ways of running AutoML and might make it harder for users to set it up correctly. Therefore, we also go one step further and study the design space of AutoML itself, proposing a solution towards truly hands-free AutoML. Together, these changes give rise to the next generation of our AutoML system, Auto-sklearn 2.0. We verify the improvements from these additions in an extensive experimental study on 39 AutoML benchmark datasets. We conclude the paper with a comparison against other popular AutoML frameworks and Auto-sklearn 1.0: Auto-sklearn 2.0 reduces the relative error by up to a factor of 4.5 and yields a performance in 10 minutes that is substantially better than what Auto-sklearn 1.0 achieves within an hour.
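To make the idea of a bandit strategy for budget allocation concrete, the following is a minimal sketch of successive halving, a common scheme of this kind. The abstract does not spell out the strategy, so the choice of successive halving here is an illustrative assumption, and the names `successive_halving`, `toy_loss`, `min_budget` and `eta` are hypothetical rather than part of any Auto-sklearn API.

```python
def successive_halving(configs, evaluate, min_budget=1, eta=2):
    """Return the configuration that survives repeated halving.

    configs  : list of candidate configurations (any objects)
    evaluate : callable (config, budget) -> loss, lower is better
    eta      : keep the best 1/eta fraction in every round
    """
    survivors = list(configs)
    budget = min_budget
    while len(survivors) > 1:
        # Score every surviving candidate at the current (cheap) budget.
        ranked = sorted(survivors, key=lambda c: evaluate(c, budget))
        # Keep the best 1/eta fraction, but always at least one candidate.
        keep = max(1, len(ranked) // eta)
        survivors = ranked[:keep]
        # Give the remaining candidates a larger budget in the next round.
        budget *= eta
    return survivors[0]


def toy_loss(config, budget):
    # Hypothetical stand-in for training a pipeline: the observed loss
    # approaches the candidate's true loss as the budget grows.
    return config["true_loss"] + 1.0 / budget


if __name__ == "__main__":
    losses = [0.4, 0.1, 0.3, 0.2, 0.5, 0.25, 0.35, 0.15]
    candidates = [{"name": f"c{i}", "true_loss": l} for i, l in enumerate(losses)]
    print(successive_halving(candidates, toy_loss)["name"])
```

Most of the total budget goes to the few candidates that look promising after cheap evaluations, which is what makes this kind of strategy attractive on large datasets under tight time limits. In a real system the `evaluate` callable would train and validate a pipeline for the given budget, for example a number of iterations or a fraction of the data.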