This work presents DONNA (Distilling Optimal Neural Network Architectures), a novel pipeline for rapid neural architecture search and search space exploration, targeting multiple hardware platforms and user scenarios. In DONNA, a search consists of three phases. First, an accuracy predictor is built for a diverse search space using blockwise knowledge distillation. This predictor enables searching across diverse macro-architectural network parameters such as layer types, attention mechanisms, and channel widths, as well as across micro-architectural parameters such as block repeats, kernel sizes, and expansion rates. Second, a rapid evolutionary search phase uses the predictor and on-device latency measurements to find a set of architectures that is Pareto-optimal in accuracy and latency for any given scenario. Third, the Pareto-optimal models can be quickly fine-tuned to full accuracy. With this approach, DONNA finds architectures that outperform the state of the art. In ImageNet classification, architectures found by DONNA are 20% faster than EfficientNet-B0 and MobileNetV2 on an Nvidia V100 GPU at similar accuracy, and 10% faster with 0.5% higher accuracy than MobileNetV2-1.4x on a Samsung S20 smartphone. In addition to neural architecture search, DONNA is used for search-space exploration and hardware-aware model compression.
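To make the notion of a Pareto-optimal set in accuracy and latency concrete, the following is a minimal Python sketch of the filtering step only. It is not DONNA's implementation: the `Candidate` tuple layout, the function name `pareto_front`, and all numbers in the example are hypothetical, and the accuracy values stand in for whatever a predictor would return.

```python
from typing import List, Tuple

# Hypothetical record layout: (name, latency_ms, predicted_accuracy).
Candidate = Tuple[str, float, float]


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep candidates that no other candidate beats on both axes.

    A candidate is dropped if another one is at least as fast and
    at least as accurate (lower latency is better, higher accuracy is better).
    """
    # Sort by latency ascending; on equal latency, put higher accuracy first.
    ordered = sorted(candidates, key=lambda c: (c[1], -c[2]))
    front: List[Candidate] = []
    best_accuracy = float("-inf")
    for cand in ordered:
        # A slower (or equally fast) candidate survives only if it is
        # strictly more accurate than every candidate seen before it.
        if cand[2] > best_accuracy:
            front.append(cand)
            best_accuracy = cand[2]
    return front


# Made-up candidates for illustration only.
example = [
    ("a", 10.0, 0.70),
    ("b", 12.0, 0.69),  # slower and less accurate than "a"
    ("c", 15.0, 0.75),
    ("d", 15.0, 0.74),  # same latency as "c" but less accurate
    ("e", 20.0, 0.75),  # slower than "c" with no accuracy gain
]
front_names = [name for name, _, _ in pareto_front(example)]
print(front_names)
```

In an evolutionary search, a filter like this would typically be applied to each generation's candidates, with latency taken from on-device measurements and accuracy from the predictor, before selecting parents for the next generation.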