Neural architecture search (NAS) has seen a steep rise in interest over the last few years. Many NAS algorithms search through a space of architectures by iteratively choosing an architecture, evaluating its performance by training it, and using all prior evaluations to inform the next choice. The evaluation step is noisy: the final accuracy varies with the random initialization of the weights. Prior work has focused on devising new search algorithms to handle this noise, rather than quantifying or understanding the level of noise in architecture evaluations. In this work, we show that (1) the simplest hill-climbing algorithm is a powerful baseline for NAS, and (2) when the noise in popular NAS benchmark datasets is reduced to a minimum, hill-climbing outperforms many popular state-of-the-art algorithms. We further back up this observation by showing that the number of local minima is substantially reduced as the noise decreases, and by giving a theoretical characterization of the performance of local search in NAS. Based on our findings, we suggest (1) using local search as a baseline in NAS research, and (2) denoising the training pipeline when possible.
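To make the hill-climbing (local search) baseline concrete, the following is a minimal sketch of greedy local search over a discrete architecture space. The functions `random_architecture`, `get_neighbors`, and `evaluate` are hypothetical stand-ins for a NAS benchmark API (e.g., a tabular accuracy lookup); they are not part of the paper and only illustrate the control flow of the algorithm.

```python
import random


def random_architecture():
    # Hypothetical: sample a random cell encoding (6 positions, 5 candidate ops each).
    return tuple(random.randint(0, 4) for _ in range(6))


def get_neighbors(arch):
    # Hypothetical neighborhood: all encodings differing from `arch` in exactly one position.
    neighbors = []
    for i in range(len(arch)):
        for op in range(5):
            if op != arch[i]:
                neighbors.append(arch[:i] + (op,) + arch[i + 1:])
    return neighbors


def evaluate(arch):
    # Hypothetical: return the (possibly noisy) validation accuracy of `arch`,
    # e.g. queried from a tabular NAS benchmark. Here: a deterministic dummy value.
    rng = random.Random(hash(arch))
    return rng.random()


def hill_climbing(max_evals=1000):
    """Greedy local search: move to the best neighbor until none improves."""
    current = random_architecture()
    current_acc = evaluate(current)
    evals = 1
    while evals < max_evals:
        best_neighbor, best_acc = None, current_acc
        for neighbor in get_neighbors(current):
            acc = evaluate(neighbor)
            evals += 1
            if acc > best_acc:
                best_neighbor, best_acc = neighbor, acc
        if best_neighbor is None:  # local optimum reached
            break
        current, current_acc = best_neighbor, best_acc
    return current, current_acc


if __name__ == "__main__":
    arch, acc = hill_climbing()
    print(f"Best architecture found: {arch} (accuracy {acc:.4f})")
```

Under a noisy `evaluate`, the search may stop at spurious local optima; with a denoised evaluation pipeline, the same loop terminates at fewer, higher-quality local minima, which is the effect studied in this work.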