In recent years, zero-cost proxies have been gaining ground in neural architecture search (NAS). These methods allow the optimal neural network for a given task to be found faster and with a lower computational load than conventional NAS methods. Equally important, they also shed some light on the internal workings of neural architectures. This paper presents a zero-cost metric that highly correlates with the train set accuracy across the NAS-Bench-101, NAS-Bench-201 and NAS-Bench-NLP benchmark datasets. Architectures are initialised with two distinct constant shared weights, one at a time. Then, a fixed random mini-batch of data is passed forward through each initialisation. We observe that the dispersion of the outputs between the two initialisations positively correlates with trained accuracy. The correlation further improves when we normalise dispersion by the average output magnitude. Our metric, epsilon, requires neither gradient computation nor labels. It thus unbinds the NAS procedure from training hyperparameters, loss metrics and human-labelled data. Our method is easy to integrate within existing NAS algorithms and takes a fraction of a second to evaluate a single network.
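The abstract describes the procedure only at a high level. The following minimal sketch, assuming a PyTorch setting, illustrates one way such a score could be computed; the helper set_constant_weights, the constants 0.1 and 1.0, and the choice of standard deviation as the dispersion measure are illustrative assumptions, not the authors' exact implementation.

import torch
import torch.nn as nn

def set_constant_weights(model: nn.Module, value: float) -> None:
    """Fill every learnable parameter with the same constant shared weight
    (assumption: the abstract does not state which constants are used)."""
    with torch.no_grad():
        for p in model.parameters():
            p.fill_(value)

def epsilon_score(model: nn.Module, batch: torch.Tensor,
                  w1: float = 0.1, w2: float = 1.0) -> float:
    """Score a candidate architecture without training, gradients or labels.

    The network is initialised twice with distinct constant shared weights,
    the same fixed random mini-batch is passed forward through both
    initialisations, and the dispersion of the outputs (here: standard
    deviation across the two runs) is normalised by the average output
    magnitude."""
    model.eval()
    outputs = []
    with torch.no_grad():
        for value in (w1, w2):
            set_constant_weights(model, value)
            outputs.append(model(batch).flatten())
    stacked = torch.stack(outputs)            # shape: (2, num_outputs)
    dispersion = stacked.std(dim=0).mean()    # spread between the two runs
    magnitude = stacked.abs().mean() + 1e-12  # normaliser; eps avoids div by 0
    return (dispersion / magnitude).item()

# Usage: rank candidate architectures by scoring each one on a single fixed
# random mini-batch, e.g.
#   batch = torch.randn(64, 3, 32, 32)
#   score = epsilon_score(candidate_net, batch)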