In recent years, the number of edge computing devices and of the artificial intelligence applications running on them has grown rapidly. In edge computing, decision making and computation move from servers to the edge devices themselves, which calls for cheap, low-power hardware. FPGAs consume little power, are naturally suited to parallel operations, and are therefore a good fit for running Convolutional Neural Networks (CNNs), a fundamental building block of artificial intelligence applications. Face detection in surveillance systems is the most anticipated application in the security market. In this work, the TinyYolov3 architecture, a CNN-based object detection method developed for embedded systems, is redesigned and deployed for face detection. The target board is the PYNQ-Z2, which carries a low-end Xilinx Zynq 7020 System-on-Chip (SoC). The redesigned TinyYolov3 model is defined at several bit-width precisions with the Brevitas library, which provides the fundamental CNN layers and activations in integer-quantized form, and is then trained in quantized form on the WiderFace dataset. To reduce latency and power consumption, the on-chip memory of the FPGA is configured to store all network parameters, and the final activation function is changed from Sigmoid to a rescaled HardTanh. A high degree of parallelism is also applied to the logic resources of the FPGA. The model is converted to an HLS-based application using the FINN framework and the FINN-HLS library, which contains the layer definitions in C++, and is then synthesized and deployed. The CPU of the SoC runs multiple threads and handles preprocessing, postprocessing, and TCP/IP streaming. With the 4-bit precision model, the system achieves a total board power consumption of 2.4 W, a throughput of 18 frames per second (FPS), and 0.757 mAP on the Easy category of WiderFace.
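To illustrate the activation change, the following is a minimal Python sketch of one plausible "rescaled HardTanh". The abstract does not give the exact rescaling, so the mapping used here (clamp to [-1, 1], then shift and scale into [0, 1]) is an assumption, as are the function names. The motivation it shows is that a clamp followed by an affine map avoids the exponential that Sigmoid needs, which is costly in FPGA logic.

```python
import math


def sigmoid(x: float) -> float:
    """Reference Sigmoid; requires an exponential."""
    return 1.0 / (1.0 + math.exp(-x))


def hardtanh(x: float, min_val: float = -1.0, max_val: float = 1.0) -> float:
    """Clamp x to [min_val, max_val]; only comparisons are needed."""
    return max(min_val, min(max_val, x))


def rescaled_hardtanh(x: float) -> float:
    """Hypothetical rescaling (an assumption, not taken from the paper):
    map the HardTanh output range [-1, 1] onto the Sigmoid range [0, 1]."""
    return 0.5 * (hardtanh(x) + 1.0)


if __name__ == "__main__":
    # Compare the two activations at a few sample inputs.
    for x in (-4.0, -1.0, 0.0, 0.5, 1.0, 4.0):
        print(f"x={x:+.1f}  sigmoid={sigmoid(x):.3f}  "
              f"rescaled_hardtanh={rescaled_hardtanh(x):.3f}")
```

Both functions agree at zero and saturate toward 0 and 1, but the piecewise-linear version reaches its limits exactly at the clamp boundaries rather than asymptotically.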