Efficient evaluation of a network architecture drawn from a large search space remains a key challenge in Neural Architecture Search (NAS). Vanilla NAS evaluates each architecture by training it from scratch, which gives the true performance but is extremely time-consuming. More recently, one-shot NAS has substantially reduced the computation cost by training only one supernetwork, a.k.a. supernet, to approximate the performance of every architecture in the search space via weight-sharing. However, the performance estimation can be very inaccurate due to the co-adaptation among operations. In this paper, we propose few-shot NAS, which uses multiple supernetworks, called sub-supernets, each covering a different region of the search space, to alleviate the undesired co-adaptation. Compared to one-shot NAS, few-shot NAS improves the accuracy of architecture evaluation with a small increase in evaluation cost. With only up to 7 sub-supernets, few-shot NAS establishes new state-of-the-art (SoTA) results: on ImageNet, it finds models that reach 80.5% top-1 accuracy at 600 MFLOPS and 77.5% top-1 accuracy at 238 MFLOPS; on CIFAR10, it reaches 98.72% top-1 accuracy without using extra data or transfer learning. In Auto-GAN, few-shot NAS outperforms the previously published results by up to 20%. Extensive experiments show that few-shot NAS significantly improves various one-shot methods, including 4 gradient-based and 6 search-based methods on 3 different tasks in NasBench-201 and NasBench1-shot-1.
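To make the idea of sub-supernets "covering different regions of the search space" concrete, the following is a minimal sketch, not the method as implemented in the paper. It assumes a hypothetical toy search space (three edges, three candidate operations per edge) and assumes that the space is split by fixing the operation on one chosen edge; the names `OPS`, `NUM_EDGES`, `all_architectures`, and `split_search_space` are illustrative only. No training is performed; the sketch only shows how the regions partition the space.

```python
from itertools import product

# Hypothetical toy search space (an assumption for illustration):
# 3 edges, each edge picks exactly one of these candidate operations.
OPS = ("conv3x3", "conv1x1", "skip")
NUM_EDGES = 3


def all_architectures(ops=OPS, num_edges=NUM_EDGES):
    """Enumerate every architecture as a tuple of per-edge operations."""
    return list(product(ops, repeat=num_edges))


def split_search_space(archs, split_edge=0):
    """Group architectures by the operation chosen on `split_edge`.

    Each group stands for the region that one sub-supernet would cover.
    Splitting on a single edge is an assumed strategy for this sketch.
    """
    regions = {}
    for arch in archs:
        regions.setdefault(arch[split_edge], []).append(arch)
    return regions


archs = all_architectures()
regions = split_search_space(archs, split_edge=0)

# One line per region: the fixed operation and how many architectures it holds.
for op, members in regions.items():
    print(op, len(members))
```

In this toy setting, a single supernet would have to share weights across all 27 architectures, whereas each of the three regions contains only 9. Under the assumptions above, that reduction in sharing is the intuition behind the claim that few-shot NAS trades a small increase in evaluation cost for more accurate performance estimates.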