An important step in neural network design tasks such as hyperparameter optimization (HPO) and neural architecture search (NAS) is evaluating a candidate model's performance. Given fixed computational resources, one can either invest more time training each model to obtain more accurate estimates of its final performance, or spend more time exploring a greater variety of models in the configuration space. In this work, we aim to optimize this exploration-exploitation trade-off for HPO and NAS on image classification by accurately approximating a model's peak performance early in the training process. In contrast to recent accelerated NAS methods customized for particular search spaces, e.g., those requiring the search space to be differentiable, our method is flexible and imposes almost no constraints on the search space. It uses the evolution history of a network's features during the early stages of training to build a proxy classifier that matches the peak performance of the network under consideration. We show that our method can be combined with multiple search algorithms to find better solutions across a wide range of HPO and NAS tasks. Using a sampling-based search algorithm and parallel computing, our method finds an architecture that outperforms DARTS while reducing wall-clock search time by 80%.
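The core idea of scoring a candidate with a proxy classifier built from early-training features can be illustrated with a minimal sketch. This is not the paper's implementation; the backbone, data loaders, number of early epochs, and the use of a simple linear head are all assumptions made for illustration.

```python
# Hypothetical sketch: estimate a candidate architecture's peak accuracy after
# only a few epochs of training by freezing its backbone, fitting a lightweight
# linear proxy classifier on the extracted features, and using the proxy's
# validation accuracy to rank candidates instead of training each to convergence.

import torch
import torch.nn as nn


@torch.no_grad()
def extract_features(backbone: nn.Module, loader, device="cpu"):
    """Collect penultimate-layer features and labels from a partially trained backbone."""
    backbone.eval()
    feats, labels = [], []
    for x, y in loader:
        feats.append(backbone(x.to(device)).flatten(1).cpu())
        labels.append(y)
    return torch.cat(feats), torch.cat(labels)


def proxy_score(backbone, train_loader, val_loader, num_classes, device="cpu", epochs=20):
    """Fit a linear proxy head on early-training features; return its validation accuracy."""
    x_tr, y_tr = extract_features(backbone, train_loader, device)
    x_va, y_va = extract_features(backbone, val_loader, device)

    head = nn.Linear(x_tr.size(1), num_classes)
    opt = torch.optim.Adam(head.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()

    for _ in range(epochs):  # full-batch training of the tiny head is cheap
        opt.zero_grad()
        loss = loss_fn(head(x_tr), y_tr)
        loss.backward()
        opt.step()

    with torch.no_grad():
        acc = (head(x_va).argmax(1) == y_va).float().mean().item()
    return acc  # proxy estimate of the candidate's final performance
```

In a sampling-based search loop, each sampled configuration would be trained only briefly, scored with a function like `proxy_score`, and the best-ranked candidates retained for full training.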