An essential task of Automated Machine Learning (AutoML) is to automatically find the pipeline with the best generalization performance on a given dataset. This problem has been addressed with sophisticated black-box optimization techniques such as Bayesian Optimization, Grammar-Based Genetic Algorithms, and tree search algorithms. Most current approaches are motivated by the assumption that optimizing the components of a pipeline in isolation may yield sub-optimal results. We present Naive AutoML, an approach that does precisely this: it optimizes the different algorithms of a pre-defined pipeline scheme in isolation. The returned pipeline is obtained by simply taking the best algorithm for each slot. The isolated optimization leads to substantially reduced search spaces, and, surprisingly, this approach yields comparable and sometimes even better performance than current state-of-the-art optimizers.
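The slot-wise idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the slot names, the candidate algorithms, the default pipeline, and the scores in `TOY_GAIN` are all invented for illustration, and holding the other slots at a fixed default while one slot is varied is an assumption about how "in isolation" might be realized. In practice `evaluate` would be a cross-validated score of the assembled pipeline.

```python
from typing import Callable, Dict, List, Optional

# A pipeline maps each slot name to the chosen algorithm (None = slot left empty).
Pipeline = Dict[str, Optional[str]]


def naive_automl(
    slots: Dict[str, List[Optional[str]]],
    default: Pipeline,
    evaluate: Callable[[Pipeline], float],
) -> Pipeline:
    """Pick the best candidate for each slot independently of the other slots."""
    best: Pipeline = {}
    for slot, candidates in slots.items():
        # Vary only this slot; every other slot stays at its default (assumption).
        def score(candidate: Optional[str], slot: str = slot) -> float:
            return evaluate({**default, slot: candidate})

        best[slot] = max(candidates, key=score)
    # The final pipeline is assembled from the per-slot winners.
    return best


# Hypothetical pipeline scheme with three slots and three candidates each.
SLOTS: Dict[str, List[Optional[str]]] = {
    "scaler": [None, "standard", "minmax"],
    "selector": [None, "variance", "kbest"],
    "learner": ["tree", "knn", "svm"],
}
DEFAULT: Pipeline = {"scaler": None, "selector": None, "learner": "tree"}

# Invented score contributions standing in for a real validation score.
TOY_GAIN = {"standard": 0.03, "minmax": 0.01, "kbest": 0.02, "knn": 0.05, "svm": 0.04}

evaluated: List[Pipeline] = []  # records every pipeline that was scored


def toy_evaluate(pipeline: Pipeline) -> float:
    evaluated.append(dict(pipeline))
    return 0.80 + sum(TOY_GAIN.get(name, 0.0) for name in pipeline.values())


best_pipeline = naive_automl(SLOTS, DEFAULT, toy_evaluate)
print(best_pipeline)
print(len(evaluated), "pipelines evaluated")
```

With three candidates in each of three slots, this sketch scores 3 + 3 + 3 = 9 pipelines, whereas an exhaustive joint search over the same scheme would score 3 × 3 × 3 = 27. That gap is the search-space reduction the abstract refers to. The toy score is additive across slots, so the per-slot winners also form the jointly best pipeline here; real pipelines carry no such guarantee.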