Object detection is widely used on embedded devices. With the wide availability of CNN (Convolutional Neural Network) accelerator chips, object detection applications are expected to run with low power consumption and high inference speed. In addition, when a CNN accelerator chip works as a co-processor with a host CPU, the CPU load is expected to be as low as possible. In this paper, we optimize the object detection model on the CNN accelerator chip by minimizing the CPU load. The resulting model is called GnetDet. Experimental results show that the GnetDet model running on a 224mW chip achieves 106 FPS with excellent accuracy.