In the literature on hyper-parameter tuning, a number of recent solutions rely on low-fidelity observations (e.g., training with sub-sampled datasets or for short periods of time) to extrapolate good configurations to use when performing full training. Among these, HyperBand is arguably one of the most popular solutions, due to its efficiency and theoretically provable robustness. In this work, we introduce HyperJump, a new approach that builds on HyperBand's robust search strategy and complements it with novel model-based risk analysis techniques that accelerate the search by skipping (jumping over) the evaluation of low-risk configurations, i.e., configurations that are likely to be discarded by HyperBand. We evaluate HyperJump on a suite of hyper-parameter optimization problems and show that it provides speed-ups of more than one order of magnitude on a variety of deep-learning and kernel-based learning problems when compared to HyperBand as well as to a number of state-of-the-art optimizers.