We propose a system for calculating a "scaling constant" for the layers and weights of neural networks. We relate this scaling constant to two quantities important for the optimizability of neural networks, and argue that a network that is "preconditioned" via scaling, in the sense that all weights have the same scaling constant, will be easier to train. This scaling calculus has a number of consequences, among them that the geometric mean of the fan-in and fan-out, rather than the fan-in, the fan-out, or their arithmetic mean, should be used to set the initialization variance of the weights in a neural network. Our system allows for the offline design and engineering of ReLU neural networks, potentially replacing blind experimentation.
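As a concrete illustration of the geometric-mean initialization rule named above, the sketch below draws weights whose variance scales with the reciprocal of the geometric mean of fan-in and fan-out, in place of the fan-in alone (He) or the arithmetic mean (Glorot). The gain of 2.0 for ReLU layers and the function name geometric_mean_init are assumptions made for illustration; the abstract does not fix the constant.

```python
import numpy as np

def geometric_mean_init(fan_in, fan_out, gain=2.0, rng=None):
    """Sample a weight matrix whose variance is set by the geometric mean
    of fan-in and fan-out, rather than fan-in alone or their arithmetic mean.

    The ReLU gain of 2.0 is an assumption carried over from He-style
    initialization, not a value stated in the abstract.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Geometric mean of fan-in and fan-out in the denominator of the variance.
    variance = gain / np.sqrt(fan_in * fan_out)
    return rng.normal(0.0, np.sqrt(variance), size=(fan_out, fan_in))

# Example: a 256 -> 512 fully connected ReLU layer.
W = geometric_mean_init(256, 512)
print(W.std())  # roughly sqrt(2 / sqrt(256 * 512))
```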