Multiplication is arguably the most cost-dominant operation in modern deep neural networks (DNNs), limiting their achievable efficiency and thus their more extensive deployment in resource-constrained applications. To tackle this limitation, pioneering works have developed handcrafted multiplication-free DNNs, which require expert knowledge and time-consuming manual iteration, calling for fast development tools. To this end, we propose a Neural Architecture Search and Acceleration framework dubbed NASA, which enables automated development of multiplication-reduced DNNs and integrates a dedicated multiplication-reduced accelerator to boost DNNs' achievable efficiency. Specifically, NASA adopts neural architecture search (NAS) spaces that augment a state-of-the-art search space with hardware-inspired multiplication-free operators, such as shift and adder, armed with a novel progressive pretrain strategy (PGP) together with customized training recipes to automatically search for optimal multiplication-reduced DNNs. On top of that, NASA further develops a dedicated accelerator, which advocates a chunk-based template and an auto-mapper tailored to the DNNs delivered by NASA's NAS, to better leverage their algorithmic properties for boosting hardware efficiency. Experimental results and ablation studies consistently validate the advantages of NASA's algorithm-hardware co-design framework in terms of achievable accuracy-efficiency tradeoffs. Code is available at https://github.com/GATECH-EIC/NASA.
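To make the two multiplication-free operators named above concrete, below is a minimal PyTorch-style sketch of a shift operator (weights rounded to signed powers of two, so each multiply becomes a bit shift) and an adder operator (an AdderNet-style negative-L1-distance "convolution" built from subtract-absolute-accumulate only). The function names (`adder_conv2d`, `shift_conv2d`) and the naive unfold-based implementation are our own illustrative assumptions, not the NASA codebase; see the linked repository for the actual operators.

```python
import torch
import torch.nn.functional as F

def adder_conv2d(x, weight, stride=1, padding=0):
    # AdderNet-style operator: replaces each multiply-accumulate with a
    # subtract-absolute-accumulate (negative L1 distance between the filter
    # and each input patch), so no multiplications are required.
    n, c_in, h, w = x.shape
    c_out, _, k, _ = weight.shape
    cols = F.unfold(x, k, stride=stride, padding=padding)   # (N, C_in*k*k, L)
    w_flat = weight.view(c_out, -1)                         # (C_out, C_in*k*k)
    # Broadcast to (N, C_out, C_in*k*k, L), then reduce the patch dimension.
    out = -(cols.unsqueeze(1) - w_flat.unsqueeze(0).unsqueeze(-1)).abs().sum(2)
    h_out = (h + 2 * padding - k) // stride + 1
    w_out = (w + 2 * padding - k) // stride + 1
    return out.view(n, c_out, h_out, w_out)

def shift_conv2d(x, weight, stride=1, padding=0):
    # Shift operator: round each weight to sign * 2^p so that the multiply in
    # a standard convolution can be realized as a bit shift in hardware.
    sign = weight.sign()
    p = weight.abs().clamp(min=1e-8).log2().round()
    w_shift = sign * torch.pow(2.0, p)
    return F.conv2d(x, w_shift, stride=stride, padding=padding)

# Quick shape sanity check.
x = torch.randn(1, 3, 8, 8)
w = torch.randn(4, 3, 3, 3)
print(adder_conv2d(x, w, padding=1).shape)  # torch.Size([1, 4, 8, 8])
print(shift_conv2d(x, w, padding=1).shape)  # torch.Size([1, 4, 8, 8])
```

Both variants keep the interface of a standard convolution while removing multiplications from the inner loop, which is the algorithmic property the dedicated accelerator is designed to exploit.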