Pruning neural networks at initialization would enable us to find sparse models that retain the accuracy of the original network while consuming fewer computational resources for training and inference. However, current methods are insufficient to enable this optimization and lead to a large degradation in model performance. In this paper, we identify a fundamental limitation in the formulation of current methods, namely that their saliency criteria look at a single step at the start of training without taking into account the trainability of the network. While pruning iteratively and gradually has been shown to improve pruning performance, explicit consideration of the training stage that will immediately follow pruning has so far been absent from the computation of the saliency criterion. To overcome the short-sightedness of existing methods, we propose Prospect Pruning (ProsPr), which uses meta-gradients through the first few steps of optimization to determine which weights to prune. ProsPr combines an estimate of the higher-order effects of pruning on the loss and the optimization trajectory to identify the trainable sub-network. Our method achieves state-of-the-art pruning performance on a variety of vision classification tasks, with less data and in a single shot compared to existing pruning-at-initialization methods.
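To make the meta-gradient idea concrete, below is a minimal sketch in PyTorch of how a pruning saliency could be computed by differentiating through a few unrolled training steps, in the spirit described above. The function name `compute_saliency`, the use of plain SGD as the inner optimizer, the number of unrolled steps, and the learning rate are illustrative assumptions, not the authors' reference implementation.

```python
# A hedged sketch: per-weight saliency as the meta-gradient of the loss after a few
# unrolled SGD steps, taken with respect to a pruning mask applied at initialization.
# All names and hyperparameters here are assumptions for illustration only.
import torch
import torch.nn.functional as F


def compute_saliency(model, batches, lr=0.1, unroll_steps=3):
    """Return a dict of per-weight saliency scores for `model`.

    `batches` must contain at least `unroll_steps + 1` (inputs, targets) pairs:
    the first `unroll_steps` drive the unrolled inner updates, the last one
    evaluates the meta-loss.
    """
    params = {name: p.detach() for name, p in model.named_parameters()}
    masks = {name: torch.ones_like(p, requires_grad=True) for name, p in params.items()}

    # Apply the (initially all-ones) mask to the initial weights.
    weights = {name: p * masks[name] for name, p in params.items()}

    # Unroll a few inner SGD steps, keeping the graph so gradients flow back to the masks.
    for x, y in batches[:unroll_steps]:
        out = torch.func.functional_call(model, weights, (x,))
        loss = F.cross_entropy(out, y)
        grads = torch.autograd.grad(loss, list(weights.values()), create_graph=True)
        weights = {name: w - lr * g for (name, w), g in zip(weights.items(), grads)}

    # Meta-loss: loss on a held-out batch after the unrolled steps.
    x, y = batches[unroll_steps]
    meta_loss = F.cross_entropy(torch.func.functional_call(model, weights, (x,)), y)
    meta_grads = torch.autograd.grad(meta_loss, list(masks.values()))

    # Saliency: magnitude of the meta-gradient; lowest-scoring weights would be pruned.
    return {name: g.abs() for name, g in zip(masks.keys(), meta_grads)}
```

Because each weight is written as `mask * weight`, the meta-gradient with respect to the mask already folds in both the weight's magnitude and its influence on the loss after the unrolled updates, which is what distinguishes this style of criterion from single-step saliencies computed only at initialization.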