Neural Architecture Search (NAS) is widely used to automatically obtain the neural network with the best performance among a large number of candidate architectures. To reduce the search time, zero-shot NAS aims at designing training-free proxies that can predict the test performance of a given architecture. However, as shown recently, none of the zero-shot proxies proposed to date can actually work consistently better than a naive proxy, namely, the number of network parameters (#Params). To improve this state of affairs, as the main theoretical contribution, we first reveal how some specific gradient properties across different samples impact the convergence rate and generalization capacity of neural networks. Based on this theoretical analysis, we propose a new zero-shot proxy, ZiCo, the first proxy that works consistently better than #Params. We demonstrate that ZiCo works better than State-Of-The-Art (SOTA) proxies on several popular NAS-Benchmarks (NASBench101, NATSBench-SSS/TSS, TransNASBench-101) for multiple applications (e.g., image classification/reconstruction and pixel-level prediction). Finally, we demonstrate that the optimal architectures found via ZiCo are as competitive as the ones found by one-shot and multi-shot NAS methods, but with much less search time. For example, ZiCo-based NAS can find optimal architectures with 78.1%, 79.4%, and 80.4% test accuracy under inference budgets of 450M, 600M, and 1000M FLOPs, respectively, on ImageNet within 0.4 GPU days. Our code is available at https://github.com/SLDGroup/ZiCo.
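To make the two kinds of training-free proxy mentioned above concrete, the following is a minimal sketch, not the released ZiCo code. `count_params` implements the naive #Params proxy. `gradient_consistency_score` is a hypothetical score built on gradient statistics across mini-batches: it rewards weights whose absolute gradient has a large mean and a small standard deviation, which is one plausible reading of "gradient properties across different samples". The function names, the input layout (one array per layer, shaped batches by weights), and the exact formula are assumptions made for illustration; the abstract itself does not state the ZiCo formula.

```python
import numpy as np


def count_params(layer_shapes):
    """Naive #Params proxy: total number of weights over all layers."""
    return int(sum(np.prod(shape) for shape in layer_shapes))


def gradient_consistency_score(per_layer_grads, eps=1e-12):
    """Hypothetical gradient-statistics proxy (an assumption, not the paper's code).

    per_layer_grads: list with one array per layer, each of shape
    (num_batches, num_weights), holding the loss gradient for every
    weight on every mini-batch, measured at initialization.
    """
    score = 0.0
    for grads in per_layer_grads:
        abs_grads = np.abs(grads)
        mean = abs_grads.mean(axis=0)  # average gradient magnitude per weight
        std = abs_grads.std(axis=0)    # spread across mini-batches per weight
        keep = std > eps               # skip weights with no variation
        layer_sum = np.sum(mean[keep] / std[keep])
        if layer_sum > 0:
            score += np.log(layer_sum)  # sum of per-layer log terms
    return float(score)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two hypothetical layers, gradients collected over 4 mini-batches.
    grads = [rng.normal(size=(4, 12)), rng.normal(size=(4, 4))]
    print("#Params:", count_params([(3, 4), (4,)]))
    print("gradient score:", gradient_consistency_score(grads))
```

In a zero-shot search, either score would be computed for every candidate architecture without training, and the candidates ranked by it. In practice the per-batch gradients would come from a few forward and backward passes in a deep learning framework rather than from random arrays.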