Neural architecture search (NAS) automatically finds the best task-specific neural network topology, outperforming many manual architecture designs. However, it can be prohibitively expensive, as the search requires training thousands of candidate networks, each of which can take hours. In this work, we propose the Graph HyperNetwork (GHN) to amortize the search cost: given an architecture, it directly generates the weights by running inference on a graph neural network. Because GHNs model the topology of an architecture, they can predict network performance more accurately than regular hypernetworks and premature early stopping. To perform NAS, we randomly sample architectures and use the validation accuracy of networks with GHN-generated weights as the surrogate search signal. GHNs are fast: they search nearly 10 times faster than other random-search methods on CIFAR-10 and ImageNet. GHNs can be further extended to the anytime-prediction setting, where they have found networks with a better speed-accuracy tradeoff than state-of-the-art manual designs.
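The search loop described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: `generate_weights` and `proxy_val_accuracy` are hypothetical stand-ins for the GHN forward pass and the surrogate validation evaluation, and the search space is a toy one.

```python
import random

def generate_weights(arch):
    # Hypothetical stand-in: the real GHN runs a graph neural network over
    # the architecture's computation graph and emits weights for every node.
    # Here we just return a deterministic pseudo-weight keyed by the arch.
    return hash(tuple(arch)) % 1000

def proxy_val_accuracy(arch, weights):
    # Hypothetical stand-in: in practice, the network is evaluated with the
    # GHN-generated weights on a held-out validation set (no training).
    rng = random.Random(weights)
    return rng.random()

def ghn_random_search(sample_arch, n_samples=100):
    """Randomly sample architectures and rank them by the validation
    accuracy obtained with GHN-generated weights (the surrogate signal)."""
    best_arch, best_acc = None, -1.0
    for _ in range(n_samples):
        arch = sample_arch()
        w = generate_weights(arch)          # one GHN inference, no training
        acc = proxy_val_accuracy(arch, w)   # surrogate search signal
        if acc > best_acc:
            best_arch, best_acc = arch, acc
    return best_arch, best_acc

# Toy search space: a depth choice and a width choice per layer.
def sample():
    return [random.choice([16, 32, 64]) for _ in range(random.choice([2, 3, 4]))]

random.seed(0)
arch, acc = ghn_random_search(sample, n_samples=50)
```

The key cost saving is that `generate_weights` replaces a full training run with a single forward pass, so evaluating 50 candidates costs 50 inferences rather than 50 training jobs.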