Today, state-of-the-art Neural Architecture Search (NAS) methods cannot scale to many hardware platforms or scenarios at low training cost, and/or can only handle non-diverse, heavily constrained architectural search spaces. To solve these issues, we present DONNA (Distilling Optimal Neural Network Architectures), a novel pipeline for rapid and diverse NAS that scales to many user scenarios. In DONNA, a search consists of three phases. First, an accuracy predictor is built using blockwise knowledge distillation. This predictor enables searching across diverse networks with varying macro-architectural parameters, such as layer types and attention mechanisms, as well as across micro-architectural parameters, such as block repeats and expansion rates. Second, a rapid evolutionary search phase finds a set of Pareto-optimal architectures for any scenario using the accuracy predictor and on-device measurements. Third, optimal models are quickly finetuned to training-from-scratch accuracy. With this approach, DONNA is up to 100x faster than MNasNet in finding state-of-the-art architectures on-device. Classifying ImageNet, DONNA architectures are 20% faster than EfficientNet-B0 and MobileNetV2 on an NVIDIA V100 GPU, and 10% faster with 0.5% higher accuracy than MobileNetV2-1.4x on a Samsung S20 smartphone. In addition to NAS, DONNA is used for search-space extension and exploration, as well as hardware-aware model compression.
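The second phase above, evolutionary search over a Pareto front of predicted accuracy versus measured latency, can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the toy `predicted_accuracy` and `measured_latency` functions stand in for the blockwise-distillation accuracy predictor and on-device measurements, and the architecture encoding (per-block expansion rate and repeat count) is an assumed simplification.

```python
import random

random.seed(0)

# Assumed toy encoding: an architecture is a tuple of per-block
# (expansion rate, block repeats) choices.
CHOICES = [(e, r) for e in (2, 4, 6) for r in (1, 2, 3)]
NUM_BLOCKS = 5

def random_arch():
    return tuple(random.choice(CHOICES) for _ in range(NUM_BLOCKS))

def mutate(arch):
    # Replace one block's choice at random.
    arch = list(arch)
    arch[random.randrange(NUM_BLOCKS)] = random.choice(CHOICES)
    return tuple(arch)

def predicted_accuracy(arch):
    # Stand-in for the distillation-based accuracy predictor:
    # larger blocks give higher, saturating accuracy.
    size = sum(e * r for e, r in arch)
    return size / (size + 30)

def measured_latency(arch):
    # Stand-in for an on-device latency measurement.
    return sum(0.4 * e + 0.7 * r for e, r in arch)

def pareto_front(pop):
    # Keep architectures not dominated in (accuracy up, latency down).
    scored = [(predicted_accuracy(a), measured_latency(a), a) for a in pop]
    return [a for acc, lat, a in scored
            if not any(acc2 >= acc and lat2 <= lat and (acc2, lat2) != (acc, lat)
                       for acc2, lat2, _ in scored)]

def evolve(generations=20, pop_size=32):
    # Mutate Pareto-optimal parents each generation.
    pop = [random_arch() for _ in range(pop_size)]
    for _ in range(generations):
        parents = pareto_front(pop) or pop
        children = [mutate(random.choice(parents)) for _ in range(pop_size)]
        pop = list(set(parents + children))
    return pareto_front(pop)

front = evolve()
```

Because the accuracy predictor is cheap to query, only the latency measurement touches the device, which is what makes repeating the search for each new hardware scenario fast.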