Hardware accelerators for convolutional neural networks (CNNs) enable real-time applications of artificial intelligence technology. However, most accelerators either support only dense CNN computations or require complex control logic to support fine-grained sparse networks. To solve this problem, this paper presents an efficient CNN accelerator with 1-D vector broadcast input that supports both dense networks and vector-sparse networks on the same hardware with low overhead. The presented design achieves a 1.93X speedup over dense CNN computations.
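The idea of sharing one datapath between dense and vector-sparse networks can be illustrated with a small software model. This is a minimal sketch under stated assumptions, not the paper's design: the function name `broadcast_mac`, the data layout, and the choice to detect sparsity on the broadcast input vector are all hypothetical, since the abstract does not specify them. Each 1-D input vector is broadcast to every processing element (PE), and an all-zero vector is skipped for all PEs at once, so the skip decision is made per vector rather than per element.

```python
def dot(a, b):
    """Dot product of two equal-length 1-D vectors."""
    return sum(x * y for x, y in zip(a, b))


def broadcast_mac(input_vectors, weight_vectors, skip_zero_vectors=True):
    """Toy model of a broadcast multiply-accumulate array (hypothetical).

    input_vectors:  list of C 1-D input vectors.
    weight_vectors: weight_vectors[k][c] is the 1-D weight vector that
                    PE k pairs with input vector c.
    Returns (accumulators, cycles), where one cycle is one broadcast.
    """
    num_pes = len(weight_vectors)
    acc = [0] * num_pes
    cycles = 0
    for c, in_vec in enumerate(input_vectors):
        # Assumption: an all-zero input vector adds nothing to any PE,
        # so one check per vector lets every PE skip it together.
        if skip_zero_vectors and not any(in_vec):
            continue
        cycles += 1
        # The same input vector is broadcast to every PE.
        for k in range(num_pes):
            acc[k] += dot(in_vec, weight_vectors[k][c])
    return acc, cycles


# Four input vectors, two of which are entirely zero.
inputs = [[1, 2, 3], [0, 0, 0], [4, 0, 1], [0, 0, 0]]
# Two PEs, each holding one weight vector per input vector.
weights = [
    [[1, 0, 1], [2, 2, 2], [0, 1, 0], [5, 5, 5]],
    [[0, 1, 0], [1, 1, 1], [1, 1, 1], [3, 3, 3]],
]

dense_out, dense_cycles = broadcast_mac(inputs, weights, skip_zero_vectors=False)
sparse_out, sparse_cycles = broadcast_mac(inputs, weights, skip_zero_vectors=True)
```

Both modes run through the same loop and give identical accumulator values; only the cycle count differs. The cycle counts in this toy example depend entirely on the made-up data and are unrelated to the 1.93X figure reported above.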