Bayesian Optimization Mixed-Precision Neural Architecture Search (BOMP-NAS) is an approach to quantization-aware neural architecture search (QA-NAS) that leverages both Bayesian optimization (BO) and mixed-precision quantization (MP) to efficiently search for compact, high-performance deep neural networks. The results show that integrating quantization-aware fine-tuning (QAFT) into the NAS loop is a necessary step for finding networks that perform well under low-precision quantization: doing so enables a model size reduction of nearly 50\% on the CIFAR-10 dataset. BOMP-NAS finds neural networks that achieve state-of-the-art performance at much lower design cost. This study shows that BOMP-NAS can find these networks with a 6x shorter search time than the closest related work.
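To make the search structure concrete, below is a minimal sketch of a BO-driven mixed-precision NAS loop with QAFT inside the loop. All function names and the size/accuracy proxies are hypothetical placeholders, not the authors' implementation; the BO acquisition step is stubbed with random sampling where a real surrogate (e.g. a Gaussian process) would rank candidates against the search history.

\begin{verbatim}
import random

def bo_suggest(history):
    # Stand-in for the Bayesian-optimization acquisition step: here we
    # sample uniformly; a real BO surrogate would score candidates by
    # expected improvement over the observations in `history`.
    depth = random.choice([2, 4, 6])
    width = random.choice([16, 32, 64])
    # Per-layer bit-widths: the mixed-precision (MP) part of the search.
    bits = [random.choice([2, 4, 8]) for _ in range(depth)]
    return {"depth": depth, "width": width, "bits": bits}

def train_quantize_qaft(cfg):
    # Placeholder for: train the candidate, quantize each layer to its
    # assigned bit-width, then run quantization-aware fine-tuning (QAFT)
    # *inside* the NAS loop, so the score the optimizer sees reflects
    # low-precision behavior rather than full-precision accuracy.
    accuracy = random.random()  # stand-in for validation accuracy
    size = cfg["width"] * cfg["depth"] * sum(cfg["bits"])  # size proxy
    return accuracy, size

history = []
for step in range(20):  # search budget
    cfg = bo_suggest(history)
    acc, size = train_quantize_qaft(cfg)
    history.append((cfg, acc, size))

best = max(history, key=lambda h: h[1])
print("best config:", best[0], "accuracy proxy:", round(best[1], 3))
\end{verbatim}

The key design point the sketch illustrates is the placement of QAFT: because fine-tuning happens before a candidate is scored, the optimizer selects architectures for their quantized performance, which is what enables the reported model-size reductions.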