Automated Machine Learning (Auto-ML) pruning methods aim to automatically search for a pruning strategy that reduces the computational complexity of deep Convolutional Neural Networks (deep CNNs). However, previous work has found that the results of many Auto-ML pruning methods cannot even surpass those of uniform pruning. In this paper, we show that the ineffectiveness of Auto-ML pruning is caused by insufficient and unfair training of the supernet. A deep supernet suffers from insufficient training because it contains too many candidates. To overcome this, a stage-wise pruning (SWP) method is proposed, which splits a deep supernet into several stage-wise supernets to reduce the number of candidates, and uses inplace distillation to supervise the training of each stage. Moreover, a wide supernet suffers from unfair training because the sampling probabilities of its channels are unequal. Therefore, the fullnet and the tinynet are sampled in each training iteration to ensure that every channel is sufficiently trained. Remarkably, the proxy performance of subnets trained with SWP is closer to their actual performance than in most previous Auto-ML pruning work. Experiments show that SWP achieves state-of-the-art results on both CIFAR-10 and ImageNet under the mobile setting.
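To make the fullnet/tinynet sampling with inplace distillation concrete, below is a minimal PyTorch-style sketch of one training iteration. The `SlimmableMLP` class, the `train_step` helper, and the unweighted sum of the two losses are illustrative assumptions for exposition, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SlimmableMLP(nn.Module):
    """Toy supernet whose active hidden width can be switched per forward pass."""
    def __init__(self, in_dim=32, hidden=64, n_classes=10):
        super().__init__()
        self.fc1 = nn.Linear(in_dim, hidden)
        self.fc2 = nn.Linear(hidden, n_classes)
        self.hidden = hidden
        self.width = hidden  # number of active hidden channels

    def forward(self, x):
        w = self.width
        # Use only the first w hidden channels of both layers.
        h = F.relu(F.linear(x, self.fc1.weight[:w], self.fc1.bias[:w]))
        return F.linear(h, self.fc2.weight[:, :w], self.fc2.bias)

def train_step(model, optimizer, x, y, tiny_width):
    optimizer.zero_grad()

    # 1) Fullnet pass: the widest candidate, trained on hard labels.
    model.width = model.hidden
    full_logits = model(x)
    loss = F.cross_entropy(full_logits, y)

    # 2) Tinynet pass: the narrowest candidate, supervised by the
    #    fullnet's soft predictions (inplace distillation).
    model.width = tiny_width
    tiny_logits = model(x)
    loss = loss + F.kl_div(
        F.log_softmax(tiny_logits, dim=1),
        F.softmax(full_logits.detach(), dim=1),
        reduction="batchmean",
    )

    loss.backward()
    optimizer.step()
    return loss.item()

model = SlimmableMLP()
opt = torch.optim.SGD(model.parameters(), lr=0.1)
x, y = torch.randn(8, 32), torch.randint(0, 10, (8,))
train_step(model, opt, x, y, tiny_width=16)
```

Because the fullnet and the tinynet bound the width range, every channel participates in at least one of the two passes in each iteration, which is the intuition behind the fairness claim above.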