State-of-the-art CNN-based object detection models are quite accurate but require a high-performance GPU to run in real time, and they remain too heavy in memory size and speed for an embedded system with limited memory. Since object detection for an autonomous system runs on an embedded processor, it is preferable to compress the detection network as much as possible while preserving the detection accuracy. There are several popular lightweight detection models, but their accuracy is too low for safe-driving applications. Therefore, this paper proposes a new object detection model, referred to as YOffleNet, which is compressed at a high ratio while minimizing the accuracy loss for real-time, safe-driving applications on an autonomous system. The backbone architecture is based on YOLOv4, but the network is compressed substantially by replacing the computation-heavy CSP DenseNet with the lighter modules of ShuffleNet. Experiments on the KITTI dataset showed that the proposed YOffleNet is 4.7 times smaller than YOLOv4-s and runs as fast as 46 FPS on an embedded GPU system (NVIDIA Jetson AGX Xavier). Considering the high compression ratio, the accuracy decreases only slightly, to 85.8% mAP, which is just 2.6% lower than YOLOv4-s. Thus, the proposed network shows high potential to be deployed on the embedded system of an autonomous system for real-time and accurate object detection applications.
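To make the key architectural idea concrete, the following is a minimal sketch (not the authors' released code) of the kind of ShuffleNet-style building block that the abstract describes as replacing the heavier CSP DenseNet blocks in the YOLOv4 backbone. The module name, channel sizes, and layer layout here are illustrative assumptions based on ShuffleNetV2's published design, not details taken from the paper.

```python
# Illustrative sketch of a ShuffleNetV2-style unit: channel split, a
# pointwise-depthwise-pointwise branch, concatenation, and channel shuffle.
# All names and sizes are hypothetical, chosen only to show the mechanism.
import torch
import torch.nn as nn


def channel_shuffle(x: torch.Tensor, groups: int) -> torch.Tensor:
    """Interleave channels across groups so the two branches exchange information."""
    n, c, h, w = x.size()
    x = x.view(n, groups, c // groups, h, w)
    x = x.transpose(1, 2).contiguous()
    return x.view(n, c, h, w)


class ShuffleUnit(nn.Module):
    """Stride-1 ShuffleNetV2-style unit: split channels in half, process one half
    with cheap 1x1 / depthwise 3x3 / 1x1 convolutions, concatenate, then shuffle."""

    def __init__(self, channels: int):
        super().__init__()
        half = channels // 2
        self.branch = nn.Sequential(
            nn.Conv2d(half, half, 1, bias=False),
            nn.BatchNorm2d(half),
            nn.ReLU(inplace=True),
            nn.Conv2d(half, half, 3, padding=1, groups=half, bias=False),  # depthwise
            nn.BatchNorm2d(half),
            nn.Conv2d(half, half, 1, bias=False),
            nn.BatchNorm2d(half),
            nn.ReLU(inplace=True),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x1, x2 = x.chunk(2, dim=1)                       # channel split
        out = torch.cat((x1, self.branch(x2)), dim=1)    # identity half + processed half
        return channel_shuffle(out, groups=2)


if __name__ == "__main__":
    # Sanity check: the unit preserves spatial size and channel count,
    # which is what allows it to stand in for a heavier backbone block.
    x = torch.randn(1, 128, 40, 40)
    print(ShuffleUnit(128)(x).shape)  # torch.Size([1, 128, 40, 40])
```

The efficiency gain comes from the depthwise and pointwise convolutions, which need far fewer multiply-accumulate operations than the dense 3x3 convolutions in a CSP DenseNet block, while the channel shuffle keeps the split branches from becoming isolated.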