Deep neural networks (DNNs) can take a large number of parameters into account, which enables them to solve complex tasks. In computer vision and speech recognition, they achieve higher accuracy than conventional algorithms, and in some tasks they even exceed the accuracy of human experts. With the progress of DNNs in recent years, many other fields of application, such as disease diagnosis and autonomous driving, are taking advantage of them. The trend in DNNs is clear: network size is growing exponentially, which leads to an exponential increase in computational effort and required memory size. For this reason, optimized hardware accelerators are used to increase the inference performance of neural networks. There are, however, various neural network hardware accelerator platforms, such as graphics processing units (GPUs), application-specific integrated circuits (ASICs), and field-programmable gate arrays (FPGAs). Each of these platforms offers certain advantages and disadvantages. There are also various methods for reducing the computational effort of DNNs, and their suitability differs from one hardware accelerator to another. This article gives an overview of existing neural network hardware accelerators and acceleration methods, shows their strengths and weaknesses, and recommends suitable applications. In particular, we focus on accelerating the inference of convolutional neural networks (CNNs) used for image recognition tasks. Among the many different hardware architectures that exist, FPGA-based implementations are well suited to showing the effect of DNN optimization methods on accuracy and throughput. For this reason, this work focuses more on FPGA-based implementations.