Training a sparse neural network from scratch requires optimizing the connections at the same time as the weights themselves. Typically, the weights are redistributed after a predefined number of weight updates: a fraction of the parameters of each layer is removed and reinserted at different locations within the same layer. The density of each layer is determined by heuristics, often based purely on the size of the parameter tensor. So while the connections within each layer are optimized multiple times during training, the density of each layer remains constant. This leaves great potential unrealized, especially in scenarios with high sparsity of 90% or more. We propose Global Gradient-based Redistribution, a technique that redistributes weights across all layers, adding more weights to the layers that need them most. Our evaluation shows that our approach is less prone to an unbalanced weight distribution at initialization than previous work and that it finds better-performing sparse subnetworks at very high sparsity levels.
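The abstract does not spell out the pruning criterion or the update schedule, so the following PyTorch sketch is only an illustration of the idea, not the paper's exact algorithm. It assumes magnitude-based pruning within each layer, gradients being available for inactive positions at redistribution time, and hypothetical names (redistribute_globally, prune_frac, masks). The point it illustrates is that regrowth is ranked globally across layers, so the freed connection budget flows to the layers with the strongest gradient signal and per-layer densities can drift apart over training.

    import torch


    def redistribute_globally(layers, masks, prune_frac=0.3):
        """One hypothetical prune-and-regrow step with global, gradient-based
        redistribution: the budget freed by pruning is reassigned to whichever
        layers currently show the largest gradient magnitudes, so the
        per-layer density can change over the course of training."""
        total_regrow = 0

        # 1) Pruning (assumed criterion: smallest weight magnitude per layer).
        for layer, mask in zip(layers, masks):
            w = layer.weight.data
            active = mask.bool()
            n_prune = int(prune_frac * active.sum().item())
            if n_prune == 0:
                continue
            scores = torch.where(active, w.abs(), torch.full_like(w, float("inf")))
            drop = torch.topk(scores.view(-1), n_prune, largest=False).indices
            mask.view(-1)[drop] = 0.0
            w.view(-1)[drop] = 0.0
            total_regrow += n_prune

        # 2) Global gradient-based regrowth: rank every inactive position of
        #    every layer by |gradient| and activate the top ones, regardless
        #    of which layer they belong to.
        grads, owners, positions = [], [], []
        for i, (layer, mask) in enumerate(zip(layers, masks)):
            g = layer.weight.grad
            if g is None:
                continue
            inactive = (mask.view(-1) == 0).nonzero(as_tuple=True)[0]
            grads.append(g.view(-1)[inactive].abs())
            owners.append(torch.full_like(inactive, i))
            positions.append(inactive)
        if not grads:
            return
        grads = torch.cat(grads)
        owners = torch.cat(owners)
        positions = torch.cat(positions)
        top = torch.topk(grads, min(total_regrow, grads.numel())).indices
        for i, pos in zip(owners[top].tolist(), positions[top].tolist()):
            masks[i].view(-1)[pos] = 1.0
            layers[i].weight.data.view(-1)[pos] = 0.0  # regrown weights start at zero

In a training loop, such a step would typically be invoked after a predefined number of weight updates, with each mask multiplied into its weight tensor after every optimizer step so that pruned positions stay at zero between redistributions.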