Modern neural network architectures typically have many millions of parameters and can be pruned substantially without significant loss in effectiveness, which demonstrates that they are over-parameterized. The contribution of this work is two-fold. The first is a method for approximating a multivariate Bernoulli random variable by means of a deterministic and differentiable transformation of any real-valued multivariate random variable. The second is a method for model selection by element-wise multiplication of parameters with approximate binary gates that may be computed deterministically or stochastically and that can take on exact zero values. Sparsity is encouraged by including a differentiable surrogate of the $L_0$ norm as a regularization term in the loss. Since the method is differentiable, it enables straightforward and efficient learning of model architectures via empirical risk minimization with stochastic gradient descent, and in principle allows conditional computation during training. The method also supports arbitrary group sparsity over parameters or activations, and therefore offers a framework for both unstructured and flexibly structured model pruning. Finally, experiments demonstrate the effectiveness of the proposed approach.
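As a concrete illustration of the idea, the sketch below shows one possible instantiation of such approximately binary gates in PyTorch: a stretched and rectified sigmoid of a logistic sample (a deterministic, differentiable transform of a real-valued random variable) together with a differentiable surrogate for the expected $L_0$ norm. This is a minimal sketch under assumed conventions; the class name `L0Gate` and the hyper-parameters `beta`, `gamma`, and `zeta` are illustrative choices, not a reference implementation of the method described above.

```python
import math

import torch
import torch.nn as nn


class L0Gate(nn.Module):
    """Approximately binary, differentiable gates with a surrogate L0 penalty.

    Illustrative sketch only: the gate construction (stretched, rectified
    sigmoid of a logistic sample) and all hyper-parameter values are
    assumptions, not the original work's reference code.
    """

    def __init__(self, n_gates, beta=2.0 / 3.0, gamma=-0.1, zeta=1.1):
        super().__init__()
        # Location parameter of each gate's underlying distribution.
        self.log_alpha = nn.Parameter(torch.zeros(n_gates))
        self.beta, self.gamma, self.zeta = beta, gamma, zeta

    def forward(self):
        if self.training:
            # Stochastic gate: a deterministic, differentiable transform of a
            # logistic sample (reparameterization trick).
            u = torch.rand_like(self.log_alpha).clamp(1e-6, 1 - 1e-6)
            s = torch.sigmoid(
                (torch.log(u) - torch.log(1 - u) + self.log_alpha) / self.beta
            )
        else:
            # Deterministic gate at evaluation time.
            s = torch.sigmoid(self.log_alpha)
        # Stretch beyond [0, 1] and rectify, so gates reach exact zeros (and ones).
        s = s * (self.zeta - self.gamma) + self.gamma
        return s.clamp(0.0, 1.0)

    def l0_penalty(self):
        # Differentiable surrogate for the expected number of non-zero gates.
        shift = self.beta * math.log(-self.gamma / self.zeta)
        return torch.sigmoid(self.log_alpha - shift).sum()
```

In use, the gates returned by `forward()` would be multiplied element-wise with a layer's parameters (or shared across rows, channels, or other groups for structured pruning), and `l0_penalty()` would be added to the training loss with a regularization weight, so that both the parameters and the effective architecture are learned jointly by stochastic gradient descent.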