Existing FPGA-based DNN accelerators typically fall into two design paradigms. They either adopt a generic, reusable architecture that supports different DNNs but sacrifices design specificity, leaving performance and efficiency on the table, or they apply a layer-wise, tailor-made architecture that optimizes for each layer's computation and resource demands but loses the scalability needed to adapt to a wide range of DNNs. To overcome these drawbacks, this paper proposes a novel FPGA-based DNN accelerator design paradigm and its automation tool, DNNExplorer, which enables fast exploration of accelerator designs under the proposed paradigm and delivers optimized accelerator architectures for existing and emerging DNNs. Three key techniques underpin DNNExplorer's improved performance, specificity, and scalability: (1) a unique accelerator design paradigm with both high-dimensional design space support and fine-grained adjustability, (2) a dynamic design space that accommodates different combinations of DNN workloads and targeted FPGAs, and (3) a design space exploration (DSE) engine that generates optimized accelerator architectures following the proposed paradigm by jointly considering the FPGA's computation and memory resources and the DNN's layer-wise characteristics and overall complexity. Experimental results show that, on the same FPGAs, accelerators generated by DNNExplorer deliver up to 4.2x higher performance (GOP/s) than the state-of-the-art layer-wise pipelined solutions generated by DNNBuilder for a VGG-like DNN with 38 CONV layers. Compared to accelerators with generic reusable computation units, DNNExplorer achieves up to 2.0x and 4.4x higher DSP efficiency than a recently published academic accelerator design (HybridDNN) and a commercial DNN accelerator IP (Xilinx DPU), respectively.
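The abstract reports results in GOP/s and DSP efficiency without defining the latter. As a rough illustration only, the sketch below assumes a commonly used definition: achieved throughput divided by the peak throughput of the allocated DSPs. The function name, its parameters, the default of 2 operations per DSP per cycle, and the numbers in the usage lines are all hypothetical and are not taken from the paper.

```python
def dsp_efficiency(achieved_gops: float,
                   num_dsp: int,
                   freq_mhz: float,
                   ops_per_dsp_per_cycle: int = 2) -> float:
    """Return achieved throughput as a fraction of peak DSP throughput.

    Assumed definition (not stated in the abstract):
        efficiency = achieved GOP/s / peak GOP/s
        peak GOP/s = ops_per_dsp_per_cycle * num_dsp * freq_mhz / 1000
    The default of 2 ops per DSP per cycle (one multiply plus one
    accumulate) is an assumption; the real value depends on data
    precision and on how the DSP slices are used.
    """
    if num_dsp <= 0 or freq_mhz <= 0 or ops_per_dsp_per_cycle <= 0:
        raise ValueError("num_dsp, freq_mhz and ops_per_dsp_per_cycle must be positive")
    # MHz * ops gives mega-ops per second; divide by 1000 to get GOP/s.
    peak_gops = ops_per_dsp_per_cycle * num_dsp * freq_mhz / 1000.0
    return achieved_gops / peak_gops


# Hypothetical design: 1000 DSPs at 200 MHz sustaining 300 GOP/s.
example_efficiency = dsp_efficiency(300.0, 1000, 200.0)
print(f"DSP efficiency: {example_efficiency:.2f}")
```

Under this definition, an "Nx DSP efficiency improvement" is simply the ratio of two such values, so designs that use different DSP counts or clock frequencies can still be compared.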