Distribution of Classification Margins: Are All Data Equal? - Zhuanzhi (专知) Papers

July 21, 2021

Distribution of Classification Margins: Are All Data Equal?

Andrzej Banburski, Fernanda De La Torre, Nishka Pant, Ishana Shastri, Tomaso Poggio

From arXiv. Previously online as CBMM Memo 115 on the CBMM MIT site.

Recent theoretical results show that gradient descent on deep neural networks under exponential loss functions locally maximizes classification margin, which is equivalent to minimizing the norm of the weight matrices under margin constraints. This property of the solution, however, does not fully characterize the generalization performance. We motivate theoretically and show empirically that the area under the curve of the margin distribution on the training set is in fact a good measure of generalization. We then show that, after data separation is achieved, it is possible to dynamically reduce the training set by more than 99% without significant loss of performance. Interestingly, the resulting subset of "high capacity" features is not consistent across different training runs, which agrees with the theoretical claim that all training points should converge to the same asymptotic margin under SGD and in the presence of both batch normalization and weight decay.
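To make the two quantities in the abstract concrete, here is a minimal NumPy sketch. It is not the authors' code. It assumes the margin of a sample is the true-class score minus the largest other-class score, that the scores passed in are already normalized (for example, by the weight norms), and that the "area under the curve" is taken over the sorted margins with the sample index rescaled to [0, 1]. The function names and the choice to keep the smallest-margin points when shrinking the training set are hypothetical, chosen by analogy with support vectors.

```python
import numpy as np


def classification_margins(scores, labels):
    """Per-sample margin: true-class score minus the best competing score.

    Assumption: `scores` are already normalized network outputs.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    rows = np.arange(len(labels))
    true_scores = scores[rows, labels]
    competitors = scores.copy()
    competitors[rows, labels] = -np.inf  # mask out the true class
    return true_scores - competitors.max(axis=1)


def margin_distribution_area(margins):
    """Area under the sorted-margin curve (trapezoidal rule).

    Assumption: the x-axis is the sample rank rescaled to [0, 1].
    """
    sorted_margins = np.sort(np.asarray(margins, dtype=float))
    x = np.linspace(0.0, 1.0, len(sorted_margins))
    heights = (sorted_margins[1:] + sorted_margins[:-1]) / 2.0
    return float(np.sum(heights * np.diff(x)))


def smallest_margin_subset(margins, keep_fraction):
    """Indices of the `keep_fraction` of samples with the smallest margins.

    Hypothetical selection rule; the abstract does not state which points
    are kept.
    """
    k = max(1, int(round(keep_fraction * len(margins))))
    return np.argsort(margins)[:k]


# Toy example: 4 samples, 3 classes.
scores = [[2.0, 0.5, 0.1],
          [0.2, 1.0, 0.9],
          [0.3, 0.1, 1.5],
          [1.0, 0.8, 0.2]]
labels = [0, 1, 2, 0]

margins = classification_margins(scores, labels)
area = margin_distribution_area(margins)
kept = smallest_margin_subset(margins, keep_fraction=0.5)
print(margins, area, kept)
```

In this toy case all margins are positive, so the data are separated; a larger area indicates that more of the training set sits far from the decision boundary. Keeping 1% of the data, as in the abstract's 99% reduction, would correspond to `keep_fraction=0.01` under the selection rule assumed here.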
