Multiplication is arguably the most cost-dominant operation in modern deep neural networks (DNNs), limiting their achievable efficiency and thus their wider deployment in resource-constrained applications. To tackle this limitation, pioneering works have developed handcrafted multiplication-free DNNs, which require expert knowledge and time-consuming manual iteration, calling for fast development tools. To this end, we propose a Neural Architecture Search and Acceleration framework dubbed NASA, which enables automated development of multiplication-reduced DNNs and integrates a dedicated multiplication-reduced accelerator to boost their achievable efficiency. Specifically, NASA adopts neural architecture search (NAS) spaces that augment the state-of-the-art one with hardware-inspired multiplication-free operators, such as shift and adder, armed with a novel progressive pretrain strategy (PGP) together with customized training recipes to automatically search for optimal multiplication-reduced DNNs. On top of that, NASA develops a dedicated accelerator, which advocates a chunk-based template and an auto-mapper dedicated to the DNNs resulting from NASA's NAS engine, to better leverage their algorithmic properties for boosting hardware efficiency. Experimental results and ablation studies consistently validate the advantages of NASA's algorithm-hardware co-design framework in terms of achievable accuracy-efficiency trade-offs. Codes are available at https://github.com/RICE-EIC/NASA.
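To make the shift and adder operators mentioned above concrete, the sketch below illustrates (in NumPy, as a simplified assumption rather than NASA's actual implementation) how a linear layer can avoid general multiplications: a "shift" layer constrains each weight to a signed power of two so that multiplying an integer input reduces to a bit shift, and an "adder" layer (in the style of AdderNet) replaces multiply-accumulate with a negated L1 distance. The function names and shapes here are hypothetical, chosen only for illustration.

```python
import numpy as np

def shift_linear(x, exponents, signs):
    """Sketch of a multiplication-free 'shift' layer: each weight is a
    signed power of two, so x * w becomes a bit shift of integer inputs.
    x: (batch, in_dim) int array; exponents, signs: (in_dim, out_dim)."""
    out = np.zeros((x.shape[0], exponents.shape[1]), dtype=np.int64)
    for j in range(exponents.shape[1]):
        for i in range(exponents.shape[0]):
            shifted = x[:, i].astype(np.int64) << exponents[i, j]
            # signs holds +/-1, so this is a sign flip, not a general multiply
            out[:, j] += signs[i, j] * shifted
    return out

def adder_linear(x, w):
    """Sketch of an 'adder' layer: multiply-accumulate is replaced by
    the negated L1 distance between inputs and weights.
    x: (batch, in_dim); w: (in_dim, out_dim)."""
    return -np.abs(x[:, :, None] - w[None, :, :]).sum(axis=1)
```

A shift layer is numerically equivalent to an ordinary linear layer whose weight matrix is `signs * 2**exponents`, which is why it can serve as a drop-in, multiplication-free substitute in a NAS search space.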