Predicting neural architecture performance is a challenging task and is crucial to neural architecture design and search. Existing approaches either rely on neural performance predictors, which are limited to modeling architectures within a predefined design space (specific sets of operators and connection rules) and cannot generalize to unseen architectures, or resort to zero-cost proxies, which are not always accurate. In this paper, we propose GENNAPE, a Generalized Neural Architecture Performance Estimator, which is pretrained on open neural architecture benchmarks and aims to generalize to completely unseen architectures through combined innovations in network representation, contrastive pretraining, and fuzzy clustering-based predictor ensembling. Specifically, GENNAPE represents a given neural network as a Computation Graph (CG) of atomic operations, which can model an arbitrary architecture. It first learns a graph encoder via Contrastive Learning to encourage network separation by topological features, and then trains multiple predictor heads, which are soft-aggregated according to the fuzzy cluster membership of a neural network. Experiments show that GENNAPE pretrained on NAS-Bench-101 achieves superior transferability to 5 different public neural network benchmarks, including NAS-Bench-201, NAS-Bench-301, and the MobileNet and ResNet families, with no or minimal fine-tuning. We further introduce 3 challenging newly labelled neural network benchmarks: HiAML, Inception and Two-Path, whose accuracies concentrate in narrow ranges. Extensive experiments show that GENNAPE can correctly discern high-performance architectures in these families. Finally, when paired with a search algorithm, GENNAPE can find architectures that improve accuracy while reducing FLOPs on three families.
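The soft-aggregation step described above can be illustrated with a minimal sketch. Assuming (hypothetically) that an architecture's graph embedding, a set of fuzzy cluster centroids, and one trained predictor head per cluster are available, a fuzzy c-means-style membership weighting of the heads could look like this; the function names, the fuzzifier `m`, and the inverse-distance membership rule are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def fuzzy_memberships(embedding, centroids, m=2.0):
    """Soft cluster memberships in the style of fuzzy c-means:
    closer centroids receive larger weights, and weights sum to 1.
    `m` > 1 is the fuzzifier controlling how soft the assignment is.
    (Illustrative assumption, not the paper's exact rule.)"""
    d = np.linalg.norm(centroids - embedding, axis=1) + 1e-12
    inv = d ** (-2.0 / (m - 1.0))
    return inv / inv.sum()

def ensemble_predict(embedding, centroids, heads):
    """Soft-aggregate per-cluster predictor heads: the final accuracy
    estimate is the membership-weighted sum of each head's prediction."""
    w = fuzzy_memberships(embedding, centroids)
    preds = np.array([head(embedding) for head in heads])
    return float(w @ preds)
```

A network whose embedding lies near one cluster centroid is thus scored mostly by that cluster's head, while networks between clusters blend several heads smoothly.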