Deep learning algorithms are a key component of many state-of-the-art vision systems, especially as Convolutional Neural Networks (CNNs) outperform most other solutions in terms of accuracy. Applying such algorithms in real-time applications requires addressing two challenges: memory footprint and computational complexity. To deal with the first issue, we use networks with reduced precision, specifically binary neural networks (also known as XNOR networks). To satisfy the computational requirements, we propose to use highly parallel and low-power FPGA devices. In this work, we explore the possibility of accelerating XNOR networks for traffic sign classification. The trained binary networks are implemented on the ZCU 104 development board, equipped with a Zynq UltraScale+ MPSoC device, using two different approaches. First, we propose a custom HDL accelerator for XNOR networks, which performs inference at almost 450 fps. Even better results are obtained with the second approach, the Xilinx FINN accelerator, which processes input images at around 550 fps. Both approaches achieve over 96% accuracy on the test set.
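To illustrate why binary (XNOR) networks reduce memory and computation, the sketch below shows the usual XNOR-popcount formulation of a dot product between two vectors whose entries are restricted to +1 and -1. It is a minimal software illustration under stated assumptions, not the HDL or FINN accelerator described above: the helper names `pack_bits` and `xnor_dot` are hypothetical, and the encoding (bit 1 for +1, bit 0 for -1) is one common convention.

```python
def pack_bits(values):
    """Pack a sequence of +1/-1 values into an integer.

    Assumed encoding: bit i is 1 when values[i] == +1, and 0 when it is -1.
    """
    word = 0
    for i, v in enumerate(values):
        if v == 1:
            word |= 1 << i
    return word


def xnor_dot(a_bits, b_bits, n):
    """Dot product of two packed +1/-1 vectors of length n.

    XNOR marks the positions where both vectors agree (product +1);
    every other position contributes -1, so the result is
    matches - (n - matches) = 2 * matches - n.
    """
    mask = (1 << n) - 1                 # keep only the n valid bits
    agree = ~(a_bits ^ b_bits) & mask   # XNOR restricted to n bits
    matches = bin(agree).count("1")     # popcount
    return 2 * matches - n


a = [1, -1, 1, 1]
b = [1, 1, -1, 1]
result = xnor_dot(pack_bits(a), pack_bits(b), len(a))
reference = sum(x * y for x, y in zip(a, b))  # ordinary dot product
```

Here `result` and `reference` agree. Each weight and activation occupies a single bit, and the multiply-accumulate reduces to a bitwise XNOR followed by a population count, an operation that maps naturally onto FPGA logic.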