培训显示神经网络复杂程度 (What training reveals about neural network complexity) - 专知论文

会员服务 ·

0

Lipschitz常数 · 输入空间 · Lipschitz · 泛化理论 · Neural Networks ·

2021 年 10 月 29 日

What training reveals about neural network complexity

翻译：培训显示神经网络复杂程度

Andreas Loukas,Marinos Poiitis,Stefanie Jegelka

from arxiv, Published at NeurIPS 2021

This work explores the Benevolent Training Hypothesis (BTH) which argues that the complexity of the function a deep neural network (NN) is learning can be deduced by its training dynamics. Our analysis provides evidence for BTH by relating the NN's Lipschitz constant at different regions of the input space with the behavior of the stochastic training procedure. We first observe that the Lipschitz constant close to the training data affects various aspects of the parameter trajectory, with more complex networks having a longer trajectory, bigger variance, and often veering further from their initialization. We then show that NNs whose 1st layer bias is trained more steadily (i.e., slowly and with little variation) have bounded complexity even in regions of the input space that are far from any training point. Finally, we find that steady training with Dropout implies a training- and data-dependent generalization bound that grows poly-logarithmically with the number of parameters. Overall, our results support the intuition that good training behavior can be a useful bias towards good generalization.

翻译：这项工作探索了福利培训假说(BTH),认为深神经网络(NN)正在学习的功能的复杂性可以通过培训动态推断出来。我们的分析通过将输入空间不同区域的NN的Lipschitz常数与随机培训程序的行为联系起来,为BTH提供了证据。我们首先发现,Lipschitz的常数接近培训数据会影响参数轨迹的各个方面,而更复杂的网络的轨道较长,差异更大,而且往往从初始化开始就更进一步。我们随后表明,即使在远离任何培训点的输入空间的区域,其一层偏差得到更稳定的训练(即缓慢和几乎没有变异)的非NNCs的复杂程度也相交织。最后,我们发现,与辍学有关的稳步培训意味着一个培训和数据依赖的概括性约束,与参数数成多对数。总体而言,我们的结果支持了这样的直觉,即良好的培训行为可能是向好的一般化方向的有用偏差。

0

相关内容

Lipschitz常数

Lipschitz常数

【干货书】机器学习算法视角，249页pdf

【干货书】机器学习算法视角，249页pdf

专知会员服务

144+阅读 · 2021年10月18日

【伯克利博士论文】深度强化学习的探索与安全性，178页pdf

专知会员服务

77+阅读 · 2021年5月23日

因果推理发展综述《The Development of Causal Reasoning》，附41页PDF下载

因果推理发展综述《The Development of Causal Reasoning》，附41页PDF下载

专知会员服务

111+阅读 · 2020年11月28日

【经典书】应用随机微分方程，324页pdf，Applied Stochastic Differential Equations

【经典书】应用随机微分方程，324页pdf，Applied Stochastic Differential Equations

专知会员服务

58+阅读 · 2020年11月21日

现代机器学习技术导论，596页pdf

专知会员服务

168+阅读 · 2020年7月27日

【清华大学】图随机神经网络，Graph Random Neural Networks

【清华大学】图随机神经网络，Graph Random Neural Networks

专知会员服务

156+阅读 · 2020年5月26日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

《DeepGCNs: Making GCNs Go as Deep as CNNs》

《DeepGCNs: Making GCNs Go as Deep as CNNs》

专知会员服务

31+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

Successor representations 强化学习表示的生物学启发

Successor representations 强化学习表示的生物学启发

CreateAMind

6+阅读 · 2019年9月5日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

Hierarchical Imitation - Reinforcement Learning

Hierarchical Imitation - Reinforcement Learning

CreateAMind

19+阅读 · 2018年5月25日

【论文推荐】最新七篇强化学习相关论文—逻辑约束、综述、多任务深度强化学习、参数服务器、事件抽取、分层强化学习、过拟合研究

【论文推荐】最新七篇强化学习相关论文—逻辑约束、综述、多任务深度强化学习、参数服务器、事件抽取、分层强化学习、过拟合研究

专知

25+阅读 · 2018年4月29日

10分钟搞懂反向传播| Neural Networks #13

10分钟搞懂反向传播| Neural Networks #13

AI研习社

3+阅读 · 2018年1月7日

可解释的CNN

可解释的CNN

CreateAMind

17+阅读 · 2017年10月5日

Neural Architecture Search without Training

Neural Architecture Search without Training

Arxiv

10+阅读 · 2021年6月11日

How Neural Networks Extrapolate: From Feedforward to Graph Neural Networks

Arxiv

5+阅读 · 2021年2月21日

Optimization and Generalization Analysis of Transduction through Gradient Boosting and Application to Multi-scale Graph Neural Networks

Arxiv

6+阅读 · 2020年6月15日

How Useful is Self-Supervised Pretraining for Visual Tasks?

How Useful is Self-Supervised Pretraining for Visual Tasks?

Arxiv

9+阅读 · 2020年3月31日

What Can Neural Networks Reason About?

Arxiv

10+阅读 · 2020年2月15日

A Comparison of Neural Network Training Methods for Text Classification

Arxiv

6+阅读 · 2019年10月28日

Logically-Constrained Reinforcement Learning

Logically-Constrained Reinforcement Learning

Arxiv

3+阅读 · 2018年12月6日

Neural Ordinary Differential Equations

Arxiv

6+阅读 · 2018年10月3日

Training behavior of deep neural network in frequency domain

Training behavior of deep neural network in frequency domain

Arxiv

4+阅读 · 2018年8月21日

Reducing Parameter Space for Neural Network Training

Arxiv

3+阅读 · 2018年8月17日

VIP会员

文章信息

相关主题

Lipschitz常数

Neural Networks

相关VIP内容

【干货书】机器学习算法视角，249页pdf

【干货书】机器学习算法视角，249页pdf

专知会员服务

144+阅读 · 2021年10月18日

【伯克利博士论文】深度强化学习的探索与安全性，178页pdf

专知会员服务

77+阅读 · 2021年5月23日

因果推理发展综述《The Development of Causal Reasoning》，附41页PDF下载

因果推理发展综述《The Development of Causal Reasoning》，附41页PDF下载

专知会员服务

111+阅读 · 2020年11月28日

【经典书】应用随机微分方程，324页pdf，Applied Stochastic Differential Equations

【经典书】应用随机微分方程，324页pdf，Applied Stochastic Differential Equations

专知会员服务

58+阅读 · 2020年11月21日

现代机器学习技术导论，596页pdf

专知会员服务

168+阅读 · 2020年7月27日

【清华大学】图随机神经网络，Graph Random Neural Networks

【清华大学】图随机神经网络，Graph Random Neural Networks

专知会员服务

156+阅读 · 2020年5月26日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

《DeepGCNs: Making GCNs Go as Deep as CNNs》

《DeepGCNs: Making GCNs Go as Deep as CNNs》

专知会员服务

31+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

热门VIP内容

开通专知VIP会员享更多权益服务

《美国海军陆战队软件定义网络应用案例：分布式防火墙自动化系统》148页

《多体环境下定位导航授时（PNT）系统研究》228页

软件定义无线电（SDR）：商业与军事领域的技术、应用及未来趋势

《攻势防空作战中无人追击者/规避者最优轨迹研究（含动态交战区建模）》95页

相关资讯

Successor representations 强化学习表示的生物学启发

Successor representations 强化学习表示的生物学启发

CreateAMind

6+阅读 · 2019年9月5日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

Hierarchical Imitation - Reinforcement Learning

Hierarchical Imitation - Reinforcement Learning

CreateAMind

19+阅读 · 2018年5月25日

【论文推荐】最新七篇强化学习相关论文—逻辑约束、综述、多任务深度强化学习、参数服务器、事件抽取、分层强化学习、过拟合研究

【论文推荐】最新七篇强化学习相关论文—逻辑约束、综述、多任务深度强化学习、参数服务器、事件抽取、分层强化学习、过拟合研究

专知

25+阅读 · 2018年4月29日

10分钟搞懂反向传播| Neural Networks #13

10分钟搞懂反向传播| Neural Networks #13

AI研习社

3+阅读 · 2018年1月7日

可解释的CNN

可解释的CNN

CreateAMind

17+阅读 · 2017年10月5日

相关论文

Neural Architecture Search without Training

Neural Architecture Search without Training

Arxiv

10+阅读 · 2021年6月11日

How Neural Networks Extrapolate: From Feedforward to Graph Neural Networks

Arxiv

5+阅读 · 2021年2月21日

Optimization and Generalization Analysis of Transduction through Gradient Boosting and Application to Multi-scale Graph Neural Networks

Arxiv

6+阅读 · 2020年6月15日

How Useful is Self-Supervised Pretraining for Visual Tasks?

How Useful is Self-Supervised Pretraining for Visual Tasks?

Arxiv

9+阅读 · 2020年3月31日

What Can Neural Networks Reason About?

Arxiv

10+阅读 · 2020年2月15日

A Comparison of Neural Network Training Methods for Text Classification

Arxiv

6+阅读 · 2019年10月28日

Logically-Constrained Reinforcement Learning

Logically-Constrained Reinforcement Learning

Arxiv

3+阅读 · 2018年12月6日

Neural Ordinary Differential Equations

Arxiv

6+阅读 · 2018年10月3日

Training behavior of deep neural network in frequency domain

Training behavior of deep neural network in frequency domain

Arxiv

4+阅读 · 2018年8月21日

Reducing Parameter Space for Neural Network Training

Arxiv

3+阅读 · 2018年8月17日

微信扫码咨询专知VIP会员