Few-shot Neural Architecture Search

Efficiently evaluating a network architecture drawn from a large search space remains a key challenge in Neural Architecture Search (NAS). Vanilla NAS evaluates each architecture by training it from scratch, which gives the true performance but is extremely time-consuming. More recently, one-shot NAS has substantially reduced the computation cost by training only one supernetwork (a.k.a. supernet) that approximates the performance of every architecture in the search space via weight-sharing. However, the performance estimate can be very inaccurate due to co-adaptation among operations. In this paper, we propose few-shot NAS, which uses multiple supernetworks, called sub-supernets, each covering a different region of the search space, to alleviate the undesired co-adaptation. Compared to one-shot NAS, few-shot NAS improves the accuracy of architecture evaluation with a small increase in evaluation cost. With only up to 7 sub-supernets, few-shot NAS establishes new SoTAs: on ImageNet, it finds models that reach 80.5% top-1 accuracy at 600 MFLOPS and 77.5% top-1 accuracy at 238 MFLOPS; on CIFAR10, it reaches 98.72% top-1 accuracy without using extra data or transfer learning. In Auto-GAN, few-shot NAS outperforms the previously published results by up to 20%. Extensive experiments show that few-shot NAS significantly improves various one-shot methods, including 4 gradient-based and 6 search-based methods on 3 different tasks in NasBench-201 and NasBench1-shot-1.
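The idea of sub-supernets that each cover a different region of the search space can be illustrated with a toy sketch. The search space, the edge and operation names, and the rule of splitting on the operation chosen for a single edge are all assumptions made for illustration; the abstract does not specify how the regions are defined. The sketch only shows the partitioning step and does not train any weights.

```python
from itertools import product

# Hypothetical toy search space: each edge selects exactly one operation.
SEARCH_SPACE = {
    "edge_0": ["conv3x3", "conv1x1", "skip"],
    "edge_1": ["conv3x3", "conv1x1", "skip"],
    "edge_2": ["conv3x3", "skip"],
}


def enumerate_architectures(space):
    """Return every architecture as a tuple of (edge, operation) pairs."""
    edges = sorted(space)
    choices = product(*(space[edge] for edge in edges))
    return [tuple(zip(edges, ops)) for ops in choices]


def split_supernet(space, split_edge):
    """Partition the space into sub-spaces, one per operation on `split_edge`.

    Each returned sub-space stands for the region a single sub-supernet
    would cover (assumed splitting rule, not taken from the abstract).
    """
    return [{**space, split_edge: [op]} for op in space[split_edge]]


sub_spaces = split_supernet(SEARCH_SPACE, "edge_0")
full = set(enumerate_architectures(SEARCH_SPACE))
parts = [set(enumerate_architectures(sub)) for sub in sub_spaces]

# The regions are disjoint and together cover the whole search space.
covered = set().union(*parts)
print(len(sub_spaces), len(full), covered == full)
```

Under this assumed rule, architectures that differ in the operation on `edge_0` no longer share weights in the same supernet, which is one way to picture how splitting could reduce co-adaptation among operations.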
