Learned Threshold Pruning - 专知论文 (Zhuanzhi Papers)

Pruning · Threshold · Networking · Model evaluation · ImageNet (dataset)

March 19, 2021

Learned Threshold Pruning

Kambiz Azarian, Yash Bhalgat, Jinwon Lee, Tijmen Blankevoort

This paper presents a novel differentiable method for unstructured weight pruning of deep neural networks. Our learned-threshold pruning (LTP) method learns per-layer thresholds via gradient descent, unlike conventional methods where they are set as input. Making thresholds trainable also makes LTP computationally efficient, hence scalable to deeper networks. For example, it takes $30$ epochs for LTP to prune ResNet50 on ImageNet by a factor of $9.1$. This is in contrast to other methods that search for per-layer thresholds via a computationally intensive iterative pruning and fine-tuning process. Additionally, with a novel differentiable $L_0$ regularization, LTP is able to operate effectively on architectures with batch-normalization. This is important since $L_1$ and $L_2$ penalties lose their regularizing effect in networks with batch-normalization. Finally, LTP generates a trail of progressively sparser networks from which the desired pruned network can be picked based on sparsity and performance requirements. These features allow LTP to achieve competitive compression rates on ImageNet networks such as AlexNet ($26.4\times$ compression with $79.1\%$ Top-5 accuracy) and ResNet50 ($9.1\times$ compression with $92.0\%$ Top-5 accuracy). We also show that LTP effectively prunes modern compact architectures, such as EfficientNet, MobileNetV2 and MixNet.
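To make the idea of a trainable per-layer threshold more concrete, the sketch below shows one way such a scheme could look. The abstract does not give the formulation, so everything here is an assumption for illustration: the hard keep/prune step is relaxed with a sigmoid of `(w^2 - tau) / temperature`, the sum of those sigmoids serves as a differentiable stand-in for the $L_0$ count, and all function names and hyperparameters (`soft_mask`, `learn_threshold`, `lam`, `temperature`, `lr`) are hypothetical. The task loss is left out entirely, so only the regularizer moves the threshold; in real training the task loss would push back against pruning useful weights.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def soft_mask(w: float, tau: float, temperature: float) -> float:
    """Smooth stand-in for the hard step [w^2 > tau] (assumed relaxation)."""
    return sigmoid((w * w - tau) / temperature)


def soft_l0(weights, tau: float, temperature: float) -> float:
    """Differentiable surrogate for the number of surviving weights."""
    return sum(soft_mask(w, tau, temperature) for w in weights)


def soft_l0_grad_tau(weights, tau: float, temperature: float) -> float:
    """Analytic d(soft_l0)/d(tau); each weight contributes -s * (1 - s) / T."""
    total = 0.0
    for w in weights:
        s = soft_mask(w, tau, temperature)
        total -= s * (1.0 - s) / temperature
    return total


def learn_threshold(weights, lam=1.0, temperature=0.01, lr=1e-4,
                    steps=100, tau=0.0) -> float:
    """Plain gradient descent on lam * soft_l0 with respect to tau only.

    The task loss is deliberately omitted in this toy, so tau can only grow.
    """
    for _ in range(steps):
        tau -= lr * lam * soft_l0_grad_tau(weights, tau, temperature)
    return tau


def hard_prune(weights, tau: float):
    """Zero out every weight whose square does not exceed the threshold."""
    return [w if w * w > tau else 0.0 for w in weights]


def compression_ratio(weights) -> float:
    """Total weights divided by nonzero weights."""
    kept = sum(1 for w in weights if w != 0.0)
    return len(weights) / kept if kept else math.inf


def sparsity_from_ratio(ratio: float) -> float:
    """Fraction of weights removed for a given compression factor."""
    return 1.0 - 1.0 / ratio


if __name__ == "__main__":
    # Hypothetical layer: four large weights and four near-zero ones.
    layer = [0.9, -0.8, 0.05, -0.02, 0.6, 0.01, -0.03, 0.7]
    tau = learn_threshold(layer)
    pruned = hard_prune(layer, tau)
    print("learned threshold:", tau)
    print("pruned layer:", pruned)
    print("compression ratio:", compression_ratio(pruned))
    print("sparsity at 9.1x:", sparsity_from_ratio(9.1))
```

The helper `sparsity_from_ratio` only restates the arithmetic behind the reported figures: a compression factor of $9.1$ corresponds to removing about 89% of the weights, and $26.4$ to about 96%.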
