Weight pruning is an effective model compression technique for tackling the challenges of achieving real-time deep neural network (DNN) inference on mobile devices. However, prior pruning schemes have limited application scenarios due to accuracy degradation, difficulty in leveraging hardware acceleration, and/or restriction to certain types of DNN layers. In this paper, we propose a general, fine-grained structured pruning scheme and corresponding compiler optimizations that are applicable to any type of DNN layer while achieving high accuracy and hardware inference performance. Because our compiler optimizations allow different pruning schemes to be applied to different layers, we further probe the new problem of determining the best-suited pruning scheme for each layer, considering that different schemes yield different acceleration and accuracy. Two pruning scheme mapping methods, one search-based and the other rule-based, are proposed to automatically derive the best-suited pruning regularity and block size for each layer of any given DNN. Experimental results demonstrate that our pruning scheme mapping methods, together with the general fine-grained structured pruning scheme, outperform the state-of-the-art DNN optimization framework with up to 2.48$\times$ and 1.73$\times$ DNN inference acceleration on the CIFAR-10 and ImageNet datasets, respectively, without accuracy loss.
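To make the notion of block-based fine-grained structured pruning concrete, the sketch below zeroes out the lowest-norm blocks of a single 2-D weight matrix. It is a minimal illustration only: the block shape, the L2-norm ranking criterion, the per-layer sparsity knob, and the function name `block_prune` are assumptions for exposition, not the exact scheme, pruning regularities, or mapping methods proposed in the paper.

```python
import numpy as np

def block_prune(weight, block_size=(4, 4), sparsity=0.5):
    """Zero out the lowest-L2-norm blocks of a 2-D weight matrix.

    Illustrative sketch: block shape, ranking criterion, and sparsity
    ratio are hypothetical knobs, not the paper's exact settings.
    """
    rows, cols = weight.shape
    br, bc = block_size
    assert rows % br == 0 and cols % bc == 0, "pad the layer to a block multiple"

    # View the matrix as a grid of (br x bc) blocks and score each block.
    blocks = weight.reshape(rows // br, br, cols // bc, bc)
    norms = np.sqrt((blocks ** 2).sum(axis=(1, 3)))

    # Keep the top (1 - sparsity) fraction of blocks, zero the rest.
    k = int(np.ceil((1.0 - sparsity) * norms.size))
    if k == 0:
        return np.zeros_like(weight)
    threshold = np.sort(norms, axis=None)[-k]
    mask = (norms >= threshold).astype(weight.dtype)

    # Broadcast the per-block mask back to element granularity.
    mask_full = np.repeat(np.repeat(mask, br, axis=0), bc, axis=1)
    return weight * mask_full

# Example: prune a 16x16 layer to 50% block sparsity with 4x4 blocks.
w = np.random.randn(16, 16).astype(np.float32)
w_pruned = block_prune(w, block_size=(4, 4), sparsity=0.5)
```

Keeping the zeros aligned in regular blocks (rather than scattering them element-wise) is what lets a compiler pack the remaining weights densely and generate efficient mobile inference code, which is the property the proposed scheme and its per-layer block-size selection exploit.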