Meta-learning hyperparameter optimization (HPO) algorithms from prior experiments is a promising approach to improve optimization efficiency over objective functions from a similar distribution. However, existing methods are restricted to learning from experiments sharing the same set of hyperparameters. In this paper, we introduce the OptFormer, the first text-based Transformer HPO framework that provides a universal end-to-end interface for jointly learning policy and function prediction when trained on vast tuning data from the wild. Our extensive experiments demonstrate that the OptFormer can imitate at least 7 different HPO algorithms, which can be further improved via its function uncertainty estimates. Compared to a Gaussian Process, the OptFormer also learns a robust prior distribution for hyperparameter response functions, and can thereby provide more accurate and better calibrated predictions. This work paves the path to future extensions for training a Transformer-based model as a general HPO optimizer.