Bayesian optimization (BO) is a widely used approach to hyperparameter optimization (HPO) in machine learning. At its core, BO iteratively evaluates promising configurations until a user-defined budget, such as wall-clock time or number of iterations, is exhausted. While the final performance after tuning heavily depends on the provided budget, it is hard to specify this budget optimally in advance. In this work, we propose an effective and intuitive termination criterion for BO that automatically stops the procedure once it is sufficiently close to the global optimum. Our key insight is that the discrepancy between the true objective (predictive performance on test data) and the computable target (validation performance) suggests stopping once the suboptimality in optimizing the target is dominated by the statistical estimation error. Across an extensive range of real-world HPO problems and baselines, we show that our termination criterion achieves a better trade-off between test performance and optimization time. Additionally, we find that overfitting may occur in the context of HPO, which is arguably an overlooked problem in the literature, and show how our termination criterion helps to mitigate this phenomenon on both small and large datasets.
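To make the stopping rule concrete, below is a minimal sketch of the kind of check the abstract describes: terminate BO once an upper bound on the possible improvement (simple regret) of the incumbent configuration falls below the statistical error of the validation estimate. This is an illustrative sketch, not the paper's exact formulation; it assumes a surrogate model that provides upper/lower confidence bounds and a cross-validation-based error estimate, and all function names (`regret_upper_bound`, `estimation_error`, `should_stop`) are hypothetical.

```python
# Illustrative sketch of a regret-vs-noise termination check for BO (maximization).
# Assumes per-iteration access to:
#   ucb: upper confidence bounds of the surrogate over candidate configurations
#   lcb: lower confidence bounds at configurations evaluated so far
#   fold_scores: per-fold validation scores of the current best configuration
import numpy as np

def regret_upper_bound(ucb: np.ndarray, lcb: np.ndarray) -> float:
    """Bound how much any remaining configuration could still improve:
    the most optimistic value anywhere minus the best pessimistic value
    among already-evaluated configurations."""
    return float(np.max(ucb) - np.max(lcb))

def estimation_error(fold_scores: np.ndarray) -> float:
    """Standard error of the validation estimate from k cross-validation folds."""
    k = len(fold_scores)
    return float(np.std(fold_scores, ddof=1) / np.sqrt(k))

def should_stop(ucb: np.ndarray, lcb: np.ndarray, fold_scores: np.ndarray) -> bool:
    # Stop once the remaining suboptimality on the validation target is
    # dominated by the statistical noise of estimating that target.
    return regret_upper_bound(ucb, lcb) < estimation_error(fold_scores)

# Example with dummy numbers:
ucb = np.array([0.91, 0.89, 0.90])                        # optimistic candidate values
lcb = np.array([0.88, 0.87])                              # pessimistic evaluated values
fold_scores = np.array([0.90, 0.88, 0.89, 0.91, 0.90])    # 5-fold CV of the incumbent
print(should_stop(ucb, lcb, fold_scores))                 # False: regret bound still exceeds noise
```

The design choice here is that once further optimization can only yield gains smaller than the noise in the validation estimate itself, additional BO iterations are unlikely to translate into better test performance, which is the trade-off the termination criterion targets.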