A key component in Neural Architecture Search (NAS) is an accuracy predictor, which estimates the accuracy of a queried architecture. To build a high-quality accuracy predictor, conventional NAS algorithms rely on training a large number of architectures or a large supernet. This step often consumes hundreds to thousands of GPU days, dominating the total search cost. To address this issue, we propose to replace the accuracy predictor with a novel model-complexity index named Zen-score. Instead of predicting model accuracy, Zen-score directly measures the model complexity of a network without training its parameters. This is inspired by recent advances in deep learning theory which show that the model complexity of a network correlates positively with its accuracy on the target dataset. Computing Zen-score takes only a few forward inferences through a randomly initialized network using random Gaussian input. It is applicable to any Vanilla Convolutional Neural Network (VCN-network) or compatible variant, covering a majority of the networks popular in real-world applications. Combining Zen-score with an Evolutionary Algorithm, we obtain a novel Zero-Shot NAS algorithm named Zen-NAS. We conduct extensive experiments on CIFAR10/CIFAR100 and ImageNet. In summary, Zen-NAS is able to design high-performance architectures in less than half a GPU day (12 GPU hours). The resultant networks, named ZenNets, achieve up to $83.0\%$ top-1 accuracy on ImageNet. Compared to EfficientNets-B3/B5 of the same or better accuracies, ZenNets are up to $5.6$ times faster on NVIDIA V100, $11$ times faster on NVIDIA T4, and $2.6$ times faster on Google Pixel2, and use $50\%$ fewer FLOPs. Our source code and pre-trained models are released at https://github.com/idstcv/ZenNAS.
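To make the idea of a training-free score driving an evolutionary search concrete, here is a minimal sketch in plain Python. It is not the Zen-score definition, which the abstract does not spell out. As a stand-in it uses a simple perturbation-sensitivity proxy: the log of the average output change when a random Gaussian input is slightly perturbed, measured on a randomly initialized network. A small bias-free ReLU MLP replaces the convolutional network, and every name here (`proxy_score`, `mutate`, `evolve`), the He-style initialization, the mutation rule, and the parameter budget are assumptions made for illustration only.

```python
import math
import random


def init_mlp(widths, rng):
    """Randomly initialise a bias-free ReLU MLP; widths = [in, h1, ..., out]."""
    layers = []
    for fan_in, fan_out in zip(widths[:-1], widths[1:]):
        std = math.sqrt(2.0 / fan_in)  # He-style scaling (an assumption)
        layers.append([[rng.gauss(0.0, std) for _ in range(fan_in)]
                       for _ in range(fan_out)])
    return layers


def forward(layers, x):
    """Forward pass: linear map followed by ReLU at every layer."""
    for weights in layers:
        x = [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in weights]
    return x


def proxy_score(widths, rng, repeats=8, alpha=0.01):
    """Training-free proxy: log of the mean output change under a small
    Gaussian perturbation of a Gaussian input (illustrative stand-in only)."""
    total = 0.0
    for _ in range(repeats):
        layers = init_mlp(widths, rng)
        x = [rng.gauss(0.0, 1.0) for _ in range(widths[0])]
        x_pert = [xi + alpha * rng.gauss(0.0, 1.0) for xi in x]
        y, y_pert = forward(layers, x), forward(layers, x_pert)
        total += math.sqrt(sum((a - b) ** 2 for a, b in zip(y, y_pert)))
    return math.log(total / repeats + 1e-12)  # epsilon guards against log(0)


def num_params(widths):
    """Number of weights in the MLP, used as a hypothetical budget."""
    return sum(a * b for a, b in zip(widths[:-1], widths[1:]))


def mutate(widths, rng):
    """Change the width of one randomly chosen hidden layer by +/-4."""
    hidden = list(widths[1:-1])
    i = rng.randrange(len(hidden))
    hidden[i] = max(4, hidden[i] + rng.choice([-4, 4]))
    return [widths[0]] + hidden + [widths[-1]]


def evolve(seed_widths, budget, steps, rng, pop_size=8):
    """Evolutionary search: mutate, score without training, keep the best."""
    population = [(proxy_score(seed_widths, rng), seed_widths)]
    for _ in range(steps):
        parent = rng.choice(population)[1]
        child = mutate(parent, rng)
        if num_params(child) > budget:
            continue  # reject candidates that exceed the budget
        population.append((proxy_score(child, rng), child))
        population.sort(key=lambda entry: entry[0], reverse=True)
        del population[pop_size:]
    return population[0]


if __name__ == "__main__":
    best_score, best_widths = evolve([8, 16, 16, 4], budget=2000, steps=20,
                                     rng=random.Random(0))
    print(best_widths, best_score)
```

No candidate is ever trained: each one costs only a handful of forward passes, which is what makes a search of this shape cheap compared with predictor-based NAS.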