Computation and data reuse are critical for resource-limited Convolutional Neural Network (CNN) accelerators. This paper presents CoDR, which applies Universal Computation Reuse to exploit weight sparsity, repetition, and similarity simultaneously within a convolutional layer. Moreover, CoDR reduces the cost of weight memory access with a customized Run-Length Encoding scheme, and the number of accesses to intermediate results with an input- and output-stationary dataflow. Compared to two recent compressed CNN accelerators of the same 2.85 mm^2 area, CoDR reduces SRAM access by 5.08x and 7.99x, and consumes 3.76x and 6.84x less energy.
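The abstract does not specify CoDR's customized Run-Length Encoding format, but the underlying idea of exploiting weight sparsity can be illustrated with generic run-length encoding of zero runs. The sketch below is a hypothetical illustration only; the function names and the (zero-run, value) pair format are assumptions, not the paper's actual scheme.

```python
def rle_encode(weights):
    """Encode a sparse weight list as (zero_run_length, value) pairs.

    Generic illustration of run-length encoding for sparse weights;
    CoDR's customized scheme may differ in its exact storage format.
    """
    encoded = []
    run = 0
    for w in weights:
        if w == 0:
            run += 1          # extend the current run of zeros
        else:
            encoded.append((run, w))  # zeros seen before this nonzero weight
            run = 0
    if run:
        encoded.append((run, 0))      # trailing zeros, marked with value 0
    return encoded


def rle_decode(encoded):
    """Reconstruct the original weight list from (zero_run, value) pairs."""
    weights = []
    for run, w in encoded:
        weights.extend([0] * run)
        if w != 0:
            weights.append(w)
    return weights
```

For a weight vector such as [0, 0, 3, 0, 1, 0, 0], the encoder emits three pairs instead of seven values, which is the kind of footprint reduction that lowers weight memory traffic.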