In the literature on hyper-parameter tuning, a number of recent solutions rely on low-fidelity observations (e.g., training with sub-sampled datasets or for short periods of time) to extrapolate good configurations to use when performing full training. Among these, HyperBand is arguably one of the most popular solutions, due to its efficiency and theoretically provable robustness. In this work, we introduce HyperJump, a new approach that builds on HyperBand's robust search strategy and complements it with novel model-based risk analysis techniques that accelerate the search by \textit{jumping} the evaluation of low-risk configurations, i.e., configurations that are likely to be discarded by HyperBand. We evaluate HyperJump on a suite of hyper-parameter optimization problems and show that it provides over one-order-of-magnitude speed-ups, in both sequential and parallel deployments, on a variety of deep learning, kernel-based learning, and neural architecture search problems when compared to HyperBand and to several state-of-the-art optimizers.