Bayesian optimization (BO) is a popular method for optimizing expensive black-box functions. It efficiently tunes machine learning algorithms under the implicit assumption that evaluating different hyperparameter configurations costs approximately the same. In reality, the cost of evaluating different hyperparameters, whether in time, dollars, or energy, can span several orders of magnitude. While a number of heuristics have been proposed to make BO cost-aware, none of them has been shown to work robustly. In this work, we reformulate cost-aware BO in terms of Pareto efficiency and introduce the cost Pareto Front, a mathematical object that allows us to highlight the shortcomings of commonly used acquisition functions. Based on this, we propose a novel Pareto-efficient adaptation of the expected improvement. On 144 real-world black-box function optimization problems, we show that our Pareto-efficient acquisition functions significantly outperform previous solutions, yielding up to 50% speed-ups while providing finer control over the cost-accuracy trade-off. We also revisit the common choice of Gaussian process cost models, showing that simple, low-variance cost models predict training times effectively.
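For context, the baseline the abstract refers to can be sketched as follows: the analytic expected improvement (EI) acquisition for minimization, together with the common "EI per unit cost" heuristic that divides EI by a predicted evaluation cost. This is a minimal illustration of the heuristics being critiqued, not the paper's Pareto-efficient variant; the function names and the scalar-input interface are assumptions for the sketch.

```python
import math

def expected_improvement(mu, sigma, best):
    """Analytic EI for minimization at a single candidate point.

    mu, sigma: Gaussian-process posterior mean and standard deviation
               at the candidate hyperparameter configuration.
    best:      incumbent, i.e. best (lowest) observed function value.
    """
    if sigma <= 0.0:
        # Degenerate posterior: improvement is deterministic.
        return max(best - mu, 0.0)
    z = (best - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    # Standard closed form: (best - mu) * Phi(z) + sigma * phi(z).
    return (best - mu) * cdf + sigma * pdf

def ei_per_unit_cost(mu, sigma, best, predicted_cost):
    """Common cost-aware heuristic: scale EI by the predicted cost.

    Doubling the predicted cost halves the acquisition value, which is
    exactly the kind of ad hoc trade-off the abstract argues can fail.
    """
    return expected_improvement(mu, sigma, best) / max(predicted_cost, 1e-12)
```

Candidates are then ranked by the acquisition value, and the maximizer is evaluated next; the cost-aware variant simply re-ranks candidates by their EI-to-cost ratio.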