The recent work of Papyan, Han, & Donoho (2020) presented an intriguing "Neural Collapse" phenomenon, showing a structural property of interpolating classifiers in the late stage of training. This opened a rich area of exploration studying this phenomenon. Our motivation is to study the upper limits of this research program: How far will understanding Neural Collapse take us in understanding deep learning? First, we investigate its role in generalization. We refine the Neural Collapse conjecture into two separate conjectures: collapse on the train set (an optimization property) and collapse on the test distribution (a generalization property). We find that while Neural Collapse often occurs on the train set, it does not occur on the test set. We thus conclude that Neural Collapse is primarily an optimization phenomenon, with as-yet-unclear connections to generalization. Second, we investigate the role of Neural Collapse in feature learning. We show simple, realistic experiments where training longer leads to worse last-layer features, as measured by transfer performance on a downstream task. This suggests that Neural Collapse is not always desirable for representation learning, as previously claimed. Finally, we give preliminary evidence of a "cascading collapse" phenomenon, wherein some form of Neural Collapse occurs not only for the last layer, but in earlier layers as well. We hope our work encourages the community to continue the rich line of Neural Collapse research, while also considering its inherent limitations.
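To make the train-versus-test distinction concrete, below is a minimal sketch of one standard way to quantify within-class variability collapse (the NC1 quantity of Papyan et al., 2020) from a network's penultimate-layer features; one would evaluate it separately on train-set and test-set features of the same trained model. The variable names (train_features, train_labels, etc.) are placeholders, not part of the original work.

import numpy as np

def nc1_metric(features, labels):
    """Within-class variability collapse (NC1), following Papyan et al. (2020):
    tr(Sigma_W @ pinv(Sigma_B)) / C, where Sigma_W and Sigma_B are the
    within-class and between-class covariances of the penultimate-layer
    features. Smaller values indicate stronger collapse onto class means."""
    classes = np.unique(labels)
    C = len(classes)
    d = features.shape[1]
    global_mean = features.mean(axis=0)
    sigma_w = np.zeros((d, d))
    sigma_b = np.zeros((d, d))
    for c in classes:
        fc = features[labels == c]
        mu_c = fc.mean(axis=0)
        centered = fc - mu_c
        sigma_w += centered.T @ centered / len(features)
        diff = (mu_c - global_mean)[:, None]
        sigma_b += diff @ diff.T / C
    return np.trace(sigma_w @ np.linalg.pinv(sigma_b)) / C

# Hypothetical usage: compare collapse on train vs. test features
# extracted from the same trained network.
# nc1_train = nc1_metric(train_features, train_labels)
# nc1_test  = nc1_metric(test_features, test_labels)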