Bayesian Optimization (BO) is a successful methodology for tuning the hyperparameters of machine learning algorithms. The user defines a metric of interest, such as the validation error, and BO finds the hyperparameters that minimize it. However, metric improvements on the validation set may not translate to the test set, especially on small datasets; in other words, BO can overfit. While cross-validation mitigates this, it comes at a high computational cost. In this paper, we carry out the first systematic investigation of overfitting in BO and demonstrate that it is a serious yet often overlooked concern in practice. We propose the first problem-adaptive and interpretable criterion for early-stopping BO, reducing overfitting while mitigating the cost of cross-validation. Experimental results on real-world hyperparameter optimization tasks show that our approach can substantially reduce compute time with little to no loss of test accuracy, demonstrating a clear practical advantage over existing techniques.
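To make the setup concrete, the sketch below shows BO-based hyperparameter tuning with early stopping. It is purely illustrative and not the paper's method: it uses scikit-optimize and scikit-learn (assumed available), a breast-cancer classification task, and a generic patience-based stopping callback standing in for the problem-adaptive criterion proposed here.

```python
# Illustrative sketch only (assumptions): scikit-optimize's gp_minimize as the BO engine,
# an SVM on a small dataset, and a simple patience rule in place of the paper's criterion.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.svm import SVC
from skopt import gp_minimize
from skopt.space import Real

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Hyperparameter search space: SVM regularization and kernel width.
space = [Real(1e-3, 1e3, prior="log-uniform", name="C"),
         Real(1e-4, 1e1, prior="log-uniform", name="gamma")]

def validation_error(params):
    # The metric BO minimizes: 5-fold cross-validation error on the training data.
    C, gamma = params
    model = SVC(C=C, gamma=gamma)
    return 1.0 - cross_val_score(model, X_train, y_train, cv=5).mean()

class PatienceStopper:
    """Stop BO when the best observed value has not improved for `patience` iterations.
    A generic heuristic, not the problem-adaptive criterion proposed in the paper."""
    def __init__(self, patience=10):
        self.patience = patience
    def __call__(self, result):
        best_at = int(np.argmin(result.func_vals))
        return len(result.func_vals) - 1 - best_at >= self.patience

# gp_minimize stops early when the callback returns True.
result = gp_minimize(validation_error, space, n_calls=50,
                     random_state=0, callback=PatienceStopper(patience=10))

# Check whether the validation improvements carry over to held-out test data.
best_C, best_gamma = result.x
test_acc = SVC(C=best_C, gamma=best_gamma).fit(X_train, y_train).score(X_test, y_test)
print(f"best CV error: {result.fun:.3f}, test accuracy: {test_acc:.3f}")
```

Comparing `1 - result.fun` (best cross-validated accuracy) against `test_acc` on such small datasets illustrates the validation-test gap that motivates the paper; stopping earlier both saves compute and limits how far the search can overfit the validation metric.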