Hyperparameter optimization (HPO) is a central pillar in the automation of machine learning solutions and is mainly performed via Bayesian optimization, where a parametric surrogate is learned to approximate the black-box response function (e.g. validation error). Unfortunately, evaluating the response function is computationally intensive. As a remedy, earlier work emphasizes the need for transfer-learning surrogates that learn to optimize the hyperparameters of an algorithm by leveraging other tasks. In contrast to previous work, we propose to rethink HPO as a few-shot learning problem in which we train a shared deep surrogate model to quickly adapt (with few response evaluations) to the response function of a new task. We propose the use of a deep kernel network for a Gaussian process surrogate that is meta-learned in an end-to-end fashion in order to jointly approximate the response functions of a collection of training data sets. As a result, the novel few-shot optimization of our deep kernel surrogate leads to new state-of-the-art results for HPO compared to several recent methods on diverse metadata sets.
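To make the approach concrete, the following is a minimal sketch (PyTorch/GPyTorch) of a deep-kernel Gaussian process surrogate that is meta-trained across a collection of tasks by maximizing the GP marginal likelihood on few-shot samples of each task's response function. The network sizes, task format, and helper names such as `FeatureExtractor` and `meta_train` are illustrative assumptions, not the paper's exact implementation.

```python
# Sketch of a meta-learned deep-kernel GP surrogate for HPO (assumed setup).
import torch
import gpytorch


class FeatureExtractor(torch.nn.Module):
    """Shared deep network mapping hyperparameter vectors to kernel features."""
    def __init__(self, in_dim, out_dim=32):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(in_dim, 64), torch.nn.ReLU(),
            torch.nn.Linear(64, out_dim),
        )

    def forward(self, x):
        return self.net(x)


class DeepKernelGP(gpytorch.models.ExactGP):
    """GP surrogate whose kernel operates on learned features (deep kernel)."""
    def __init__(self, train_x, train_y, likelihood, feature_extractor):
        super().__init__(train_x, train_y, likelihood)
        self.feature_extractor = feature_extractor
        self.mean_module = gpytorch.means.ConstantMean()
        self.covar_module = gpytorch.kernels.ScaleKernel(gpytorch.kernels.RBFKernel())

    def forward(self, x):
        z = self.feature_extractor(x)
        return gpytorch.distributions.MultivariateNormal(
            self.mean_module(z), self.covar_module(z)
        )


def meta_train(tasks, in_dim, steps=1000, shots=5, lr=1e-3):
    """Meta-train the shared surrogate on few-shot samples from the response
    functions of the training tasks. `tasks` is assumed to be a list of
    (X, y) tensors, one pair per training data set."""
    extractor = FeatureExtractor(in_dim)
    likelihood = gpytorch.likelihoods.GaussianLikelihood()
    # Dummy initial data; replaced per task below via set_train_data.
    model = DeepKernelGP(torch.zeros(1, in_dim), torch.zeros(1), likelihood, extractor)
    mll = gpytorch.mlls.ExactMarginalLogLikelihood(likelihood, model)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)

    model.train()
    likelihood.train()
    for step in range(steps):
        X, y = tasks[step % len(tasks)]                 # pick a training task
        idx = torch.randperm(X.shape[0])[:shots]        # few response evaluations
        model.set_train_data(X[idx], y[idx], strict=False)
        optimizer.zero_grad()
        loss = -mll(model(X[idx]), y[idx])              # negative marginal log-likelihood
        loss.backward()                                 # updates kernel and shared network end-to-end
        optimizer.step()
    return model, likelihood
```

At HPO time on a new task, the same loss could be minimized for a few steps on the observations gathered so far (the few-shot adaptation), after which the GP posterior drives an acquisition function as in standard Bayesian optimization.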