On the Implicit Bias Towards Minimal Depth of Deep Neural Networks - 专知 Papers


April 26, 2022

On the Implicit Bias Towards Minimal Depth of Deep Neural Networks


Tomer Galanti, Liane Galanti

We study the implicit bias of stochastic gradient descent towards low-depth solutions when training deep neural networks. Recent results in the literature suggest that the penultimate-layer representations learned by a multi-class classifier exhibit a clustering property called neural collapse. First, we empirically show that neural collapse generally strengthens as the number of layers increases. In addition, we demonstrate that neural collapse extends beyond the penultimate layer and emerges in intermediate layers as well, making the higher layers essentially redundant. We characterize a notion of effective depth, which measures the minimal layer that enjoys neural collapse. In this regard, we hypothesize and empirically show that gradient descent implicitly selects neural networks of small effective depth. Finally, we theoretically and empirically show that the effective depth of a trained neural network increases monotonically when training with larger portions of random labels, and we connect it with generalization.
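The idea of effective depth as "the minimal layer that enjoys neural collapse" can be illustrated with a small NumPy sketch. The abstract does not specify how collapse is measured, so everything below is an assumption made for illustration: `collapse_score` is a hypothetical proxy (mean within-class variance divided by mean squared distance between class means), `effective_depth` is a hypothetical helper, and the `threshold` of 0.1 is arbitrary. It is not presented as the authors' definition.

```python
import numpy as np


def collapse_score(features, labels):
    """Assumed proxy for neural collapse: mean within-class variance
    divided by the mean squared distance between distinct class means.
    Smaller values mean tighter, better-separated class clusters."""
    classes = np.unique(labels)
    means = np.stack([features[labels == c].mean(axis=0) for c in classes])
    # Average squared distance of each sample to its own class mean.
    within = np.mean([
        np.mean(np.sum((features[labels == c] - means[i]) ** 2, axis=1))
        for i, c in enumerate(classes)
    ])
    # Average squared distance over ordered pairs of distinct class means
    # (the diagonal contributes zero to the sum).
    diffs = means[:, None, :] - means[None, :, :]
    k = len(classes)
    between = np.sum(diffs ** 2) / (k * (k - 1))
    return within / between


def effective_depth(layer_features, labels, threshold=0.1):
    """1-based index of the first layer whose collapse score is at most
    `threshold` (a hypothetical cutoff), or None if no layer qualifies."""
    for depth, feats in enumerate(layer_features, start=1):
        if collapse_score(feats, labels) <= threshold:
            return depth
    return None


# Synthetic stand-in for per-layer features: two classes whose clusters
# tighten with depth (noise scale 5.0, then 1.0, then 0.1).
rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 50)
centers = np.array([[0.0, 0.0], [10.0, 0.0]])[labels]
layers = [centers + s * rng.normal(size=centers.shape) for s in (5.0, 1.0, 0.1)]

for depth, feats in enumerate(layers, start=1):
    print(f"layer {depth}: collapse score = {collapse_score(feats, labels):.4f}")
print("effective depth:", effective_depth(layers, labels))
```

In this toy setting the third layer adds nothing once the second layer is already collapsed, which mirrors the abstract's remark that higher layers become essentially redundant. With a real network, `layers` would be replaced by activations recorded at each layer on the training set.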

