State-of-the-art CNN-based object detection models are quite accurate but require a high-performance GPU to run in real time, and they remain too heavy in memory size and speed for an embedded system with limited memory. Since object detection for an autonomous system runs on an embedded processor, it is preferable to compress the detection network as much as possible while preserving the detection accuracy. There are several popular lightweight detection models, but their accuracy is too low for safe-driving applications. Therefore, this paper proposes a new object detection model, referred to as YOffleNet, which is compressed at a high ratio while minimizing the accuracy loss for real-time, safe-driving applications on an autonomous system. The backbone architecture is based on YOLOv4, but the network is compressed substantially by replacing the computation-heavy CSP DenseNet with the lighter modules of ShuffleNet. Experiments on the KITTI dataset showed that the proposed YOffleNet is 4.7 times smaller than YOLOv4-s and runs as fast as 46 FPS on an embedded GPU system (NVIDIA Jetson AGX Xavier). Considering the high compression ratio, the accuracy decreases only slightly, to 85.8% mAP, which is just 2.6% lower than YOLOv4-s. Thus, the proposed network shows high potential to be deployed on the embedded system of an autonomous system for real-time and accurate object detection applications.
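To make the key architectural idea concrete, the following is a minimal sketch (not the authors' released code) of the kind of ShuffleNet-style building block that the abstract describes as replacing the heavier CSP DenseNet blocks in the YOLOv4 backbone. The module name, channel sizes, and layer layout here are illustrative assumptions based on ShuffleNetV2's published design, not details taken from the paper.

```python
# Illustrative sketch of a ShuffleNetV2-style unit: channel split, a
# pointwise-depthwise-pointwise branch, concatenation, and channel shuffle.
# All names and sizes are hypothetical, chosen only to show the mechanism.
import torch
import torch.nn as nn


def channel_shuffle(x: torch.Tensor, groups: int) -> torch.Tensor:
    """Interleave channels across groups so the two branches exchange information."""
    n, c, h, w = x.size()
    x = x.view(n, groups, c // groups, h, w)
    x = x.transpose(1, 2).contiguous()
    return x.view(n, c, h, w)


class ShuffleUnit(nn.Module):
    """Stride-1 ShuffleNetV2-style unit: split channels in half, process one half
    with cheap 1x1 / depthwise 3x3 / 1x1 convolutions, concatenate, then shuffle."""

    def __init__(self, channels: int):
        super().__init__()
        half = channels // 2
        self.branch = nn.Sequential(
            nn.Conv2d(half, half, 1, bias=False),
            nn.BatchNorm2d(half),
            nn.ReLU(inplace=True),
            nn.Conv2d(half, half, 3, padding=1, groups=half, bias=False),  # depthwise
            nn.BatchNorm2d(half),
            nn.Conv2d(half, half, 1, bias=False),
            nn.BatchNorm2d(half),
            nn.ReLU(inplace=True),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x1, x2 = x.chunk(2, dim=1)                       # channel split
        out = torch.cat((x1, self.branch(x2)), dim=1)    # identity half + processed half
        return channel_shuffle(out, groups=2)


if __name__ == "__main__":
    # Sanity check: the unit preserves spatial size and channel count,
    # which is what allows it to stand in for a heavier backbone block.
    x = torch.randn(1, 128, 40, 40)
    print(ShuffleUnit(128)(x).shape)  # torch.Size([1, 128, 40, 40])
```

The efficiency gain comes from the depthwise and pointwise convolutions, which need far fewer multiply-accumulate operations than the dense 3x3 convolutions in a CSP DenseNet block, while the channel shuffle keeps the split branches from becoming isolated.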