具有正规化特征的神经崩溃:关于里曼曼形的几何分析</s> (Neural Collapse with Normalized Features: A Geometric Analysis over the Riemannian Manifold) - 专知论文

会员服务 ·

0

规范化的 · 流形 · 层 · Analysis · 非凸 ·

2023 年 3 月 7 日

Neural Collapse with Normalized Features: A Geometric Analysis over the Riemannian Manifold

翻译：具有正规化特征的神经崩溃:关于里曼曼形的几何分析

Can Yaras,Peng Wang,Zhihui Zhu,Laura Balzano,Qing Qu

from arxiv, The first two authors contributed to this work equally; 38 pages, 13 figures. Accepted at NeurIPS'22

When training overparameterized deep networks for classification tasks, it has been widely observed that the learned features exhibit a so-called "neural collapse" phenomenon. More specifically, for the output features of the penultimate layer, for each class the within-class features converge to their means, and the means of different classes exhibit a certain tight frame structure, which is also aligned with the last layer's classifier. As feature normalization in the last layer becomes a common practice in modern representation learning, in this work we theoretically justify the neural collapse phenomenon for normalized features. Based on an unconstrained feature model, we simplify the empirical loss function in a multi-class classification task into a nonconvex optimization problem over the Riemannian manifold by constraining all features and classifiers over the sphere. In this context, we analyze the nonconvex landscape of the Riemannian optimization problem over the product of spheres, showing a benign global landscape in the sense that the only global minimizers are the neural collapse solutions while all other critical points are strict saddles with negative curvature. Experimental results on practical deep networks corroborate our theory and demonstrate that better representations can be learned faster via feature normalization.

翻译：当培训过分的深度分类任务深层网络时,人们广泛观察到,所学的特征表现出所谓的“神经崩溃”现象。更具体地说,对于倒数第二层的产出特征,每类的内层特征与其手段汇合,而不同类的手段则呈现某种紧凑的框架结构,这也与最后一层的分类器相一致。随着上层的特征正常化成为现代代表性学习的一个常见做法,在这项工作中,我们理论上证明神经崩溃现象是正常特征的典型。基于一个不受限制的特点模型,我们通过限制所有特征和分层,将多级分类任务中的经验性损失函数简化为里曼尼亚多层的非康克斯优化问题。在这方面,我们分析了里曼最优化问题的非康韦克斯景观,从一种良好的全球景观中可以看出,只有全球最小化者才是神经崩溃的解决方案,而所有其他关键点都是严格的负曲线。关于实际深层网络的实验结果证实了我们的理论,并表明,可以通过更快速的正规化特征来学习更好的表现。</s>

0

相关内容

规范化的

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

专知会员服务

75+阅读 · 2022年6月28日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

95+阅读 · 2020年3月12日

【新书】数字图像(影像)处理手第二版，2176pdf，Mathematical Methods in Imaging

【新书】数字图像(影像)处理手第二版，2176pdf，Mathematical Methods in Imaging

专知会员服务

93+阅读 · 2020年2月12日

社交网络上议题社群的公共焦虑研究，中国人民大学新闻学院塔娜讲师，第八届全国社会媒体处理大会SMP2019

社交网络上议题社群的公共焦虑研究，中国人民大学新闻学院塔娜讲师，第八届全国社会媒体处理大会SMP2019

专知会员服务

15+阅读 · 2019年10月23日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

图与推荐

2+阅读 · 2022年11月2日

局部学习的特征选择：Local-Learning-Based Feature Selection

局部学习的特征选择：Local-Learning-Based Feature Selection

我爱读PAMI

14+阅读 · 2019年9月20日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

MoCoGAN 分解运动和内容的视频生成

MoCoGAN 分解运动和内容的视频生成

CreateAMind

18+阅读 · 2017年10月21日

可解释的CNN

可解释的CNN

CreateAMind

17+阅读 · 2017年10月5日

【推荐】深度学习目标检测全面综述

【推荐】深度学习目标检测全面综述

机器学习研究会

21+阅读 · 2017年9月13日

一类离散Hindmarsh-Rose模型的分支延拓

国家自然科学基金

0+阅读 · 2015年12月31日

带有噪声扰动的动力系统分支问题研究

国家自然科学基金

0+阅读 · 2015年12月31日

罗巴代数的表示和罗巴代数在operad中的应用

国家自然科学基金

0+阅读 · 2015年12月31日

Heisenberg群与Minkowski空间中的非线性椭圆方程

国家自然科学基金

0+阅读 · 2014年12月31日

约束集值优化问题的适定性研究及相关分析

国家自然科学基金

0+阅读 · 2013年12月31日

动力系统的可积、分支与嵌入流

国家自然科学基金

0+阅读 · 2012年12月31日

有理动力系统中的拓扑和拟共形几何

国家自然科学基金

1+阅读 · 2012年12月31日

函数域中的Vinogradov中值定理

国家自然科学基金

0+阅读 · 2012年12月31日

复动力系统及其应用

国家自然科学基金

0+阅读 · 2011年12月31日

不可压Navier-Stokes方程的适定性与正则性研究

国家自然科学基金

0+阅读 · 2009年12月31日

A diffusion approach to Stein's method on Riemannian manifolds

Arxiv

0+阅读 · 2023年4月27日

Coded matrix computation with gradient coding

Arxiv

0+阅读 · 2023年4月26日

The Nonlocal Neural Operator: Universal Approximation

Arxiv

0+阅读 · 2023年4月26日

BO-ICP: Initialization of Iterative Closest Point Based on Bayesian Optimization

Arxiv

0+阅读 · 2023年4月25日

Directed Graph Embeddings in Pseudo-Riemannian Manifolds

Arxiv

12+阅读 · 2021年6月16日

GAN Inversion: A Survey

Arxiv

19+阅读 · 2021年1月14日

Overcoming Catastrophic Forgetting in Graph Neural Networks

Arxiv

14+阅读 · 2020年12月10日

On Feature Normalization and Data Augmentation

On Feature Normalization and Data Augmentation

Arxiv

15+阅读 · 2020年2月25日

Graph Convolutional Networks for Text Classification

Arxiv

11+阅读 · 2018年10月17日

HyperGCN: Hypergraph Convolutional Networks for Semi-Supervised Classification

HyperGCN: Hypergraph Convolutional Networks for Semi-Supervised Classification

Arxiv

13+阅读 · 2018年9月7日

VIP会员

文章信息

相关主题

相关VIP内容

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

专知会员服务

75+阅读 · 2022年6月28日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

95+阅读 · 2020年3月12日

【新书】数字图像(影像)处理手第二版，2176pdf，Mathematical Methods in Imaging

【新书】数字图像(影像)处理手第二版，2176pdf，Mathematical Methods in Imaging

专知会员服务

93+阅读 · 2020年2月12日

社交网络上议题社群的公共焦虑研究，中国人民大学新闻学院塔娜讲师，第八届全国社会媒体处理大会SMP2019

社交网络上议题社群的公共焦虑研究，中国人民大学新闻学院塔娜讲师，第八届全国社会媒体处理大会SMP2019

专知会员服务

15+阅读 · 2019年10月23日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

《小型无人机系统侦测追踪技术：声学、计算机视觉与深度学习融合方案》最新98页

《"牧羊人网格"拦截策略：实现无人机集群可靠拦截的新范式》

光纤无人机：反无人机系统的重大挑战

《作战建模与仿真实证研究》

相关资讯

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

图与推荐

2+阅读 · 2022年11月2日

局部学习的特征选择：Local-Learning-Based Feature Selection

局部学习的特征选择：Local-Learning-Based Feature Selection

我爱读PAMI

14+阅读 · 2019年9月20日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

MoCoGAN 分解运动和内容的视频生成

MoCoGAN 分解运动和内容的视频生成

CreateAMind

18+阅读 · 2017年10月21日

可解释的CNN

可解释的CNN

CreateAMind

17+阅读 · 2017年10月5日

【推荐】深度学习目标检测全面综述

【推荐】深度学习目标检测全面综述

机器学习研究会

21+阅读 · 2017年9月13日

相关论文

A diffusion approach to Stein's method on Riemannian manifolds

Arxiv

0+阅读 · 2023年4月27日

Coded matrix computation with gradient coding

Arxiv

0+阅读 · 2023年4月26日

The Nonlocal Neural Operator: Universal Approximation

Arxiv

0+阅读 · 2023年4月26日

BO-ICP: Initialization of Iterative Closest Point Based on Bayesian Optimization

Arxiv

0+阅读 · 2023年4月25日

Directed Graph Embeddings in Pseudo-Riemannian Manifolds

Arxiv

12+阅读 · 2021年6月16日

GAN Inversion: A Survey

Arxiv

19+阅读 · 2021年1月14日

Overcoming Catastrophic Forgetting in Graph Neural Networks

Arxiv

14+阅读 · 2020年12月10日

On Feature Normalization and Data Augmentation

On Feature Normalization and Data Augmentation

Arxiv

15+阅读 · 2020年2月25日

Graph Convolutional Networks for Text Classification

Arxiv

11+阅读 · 2018年10月17日

HyperGCN: Hypergraph Convolutional Networks for Semi-Supervised Classification

HyperGCN: Hypergraph Convolutional Networks for Semi-Supervised Classification

Arxiv

13+阅读 · 2018年9月7日

相关基金

一类离散Hindmarsh-Rose模型的分支延拓

国家自然科学基金

0+阅读 · 2015年12月31日

带有噪声扰动的动力系统分支问题研究

国家自然科学基金

0+阅读 · 2015年12月31日

罗巴代数的表示和罗巴代数在operad中的应用

国家自然科学基金

0+阅读 · 2015年12月31日

Heisenberg群与Minkowski空间中的非线性椭圆方程

国家自然科学基金

0+阅读 · 2014年12月31日

约束集值优化问题的适定性研究及相关分析

国家自然科学基金

0+阅读 · 2013年12月31日

动力系统的可积、分支与嵌入流

国家自然科学基金

0+阅读 · 2012年12月31日

有理动力系统中的拓扑和拟共形几何

国家自然科学基金

1+阅读 · 2012年12月31日

函数域中的Vinogradov中值定理

国家自然科学基金

0+阅读 · 2012年12月31日

复动力系统及其应用

国家自然科学基金

0+阅读 · 2011年12月31日

不可压Navier-Stokes方程的适定性与正则性研究

国家自然科学基金

0+阅读 · 2009年12月31日

微信扫码咨询专知VIP会员