In the literature on hyper-parameter tuning, a number of recent solutions rely on low-fidelity observations (e.g., training with sub-sampled datasets) to efficiently identify promising configurations that are then tested via high-fidelity observations (e.g., using the full dataset). Among these, HyperBand is arguably one of the most popular solutions, due to its efficiency and theoretically provable robustness. In this work, we introduce HyperJump, a new approach that builds on HyperBand's robust search strategy and complements it with novel model-based risk analysis techniques that accelerate the search by skipping the evaluation of low-risk configurations, i.e., configurations that are likely to be eventually discarded by HyperBand. We evaluate HyperJump on a suite of hyper-parameter optimization problems and show that it provides speed-ups of over one order of magnitude, in both sequential and parallel deployments, on a variety of deep-learning, kernel-based learning, and neural architecture search problems when compared to HyperBand and to several state-of-the-art optimizers.
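To make the skipping idea concrete, below is a minimal sketch (not the authors' implementation) of a single successive-halving bracket, the primitive that HyperBand iterates over, extended with a hypothetical risk-based skip rule. The interfaces `evaluate(cfg, budget)` and `predict_loss(cfg, budget)`, the `skip_threshold` parameter, and the Gaussian survival test are all illustrative assumptions; HyperJump's actual risk analysis is more elaborate than this.

```python
import math


def normal_cdf(x, mu, sigma):
    """P(X <= x) for X ~ N(mu, sigma^2)."""
    if sigma <= 0:
        return 1.0 if x >= mu else 0.0
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def successive_halving_with_skips(configs, evaluate, predict_loss,
                                  min_budget=1, max_budget=81,
                                  eta=3, skip_threshold=0.05):
    """One successive-halving bracket augmented with a model-based skip:
    a configuration is dropped *without* being evaluated when a cheap
    surrogate says its chance of surviving the next elimination is
    below `skip_threshold`.

    Assumed (hypothetical) interfaces:
      evaluate(cfg, budget) -> loss           # expensive observation
      predict_loss(cfg, budget) -> (mean, std)  # cheap surrogate
    Configurations are assumed hashable (e.g., tuples).
    """
    survivors = list(configs)
    budget = min_budget
    while budget <= max_budget and len(survivors) > 1:
        keep = max(1, len(survivors) // eta)
        # Cheap surrogate predictions for every survivor at this budget.
        preds = {cfg: predict_loss(cfg, budget) for cfg in survivors}
        # The keep-th best predicted mean serves as the elimination cutoff.
        cutoff = sorted(m for m, _ in preds.values())[keep - 1]
        observed = {}
        for cfg in survivors:
            mean, std = preds[cfg]
            p_survive = normal_cdf(cutoff, mean, std)
            if p_survive < skip_threshold:
                continue  # low-risk skip: almost certainly eliminated anyway
            observed[cfg] = evaluate(cfg, budget)  # pay for the evaluation
        # Standard successive-halving elimination on the observed losses.
        survivors = sorted(observed, key=observed.get)[:keep]
        budget *= eta
    return survivors[0] if survivors else None
```

In HyperBand proper, this bracket would be re-run under several trade-offs between the number of initial configurations and the minimum budget; the skip rule simply avoids paying for evaluations of configurations that the surrogate already marks as near-certain eliminations.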