深深学习培训结束阶段的神经衰损率 (Prevalence of Neural Collapse during the terminal phase of deep learning training) - 专知论文

会员服务 ·

0

Better · DeepNet · 单纯形 · 训练误差 · 再缩放 ·

2020 年 8 月 21 日

Prevalence of Neural Collapse during the terminal phase of deep learning training

翻译：深深学习培训结束阶段的神经衰损率

Vardan Papyan,X. Y. Han,David L. Donoho

Modern practice for training classification deepnets involves a Terminal Phase of Training (TPT), which begins at the epoch where training error first vanishes; During TPT, the training error stays effectively zero while training loss is pushed towards zero. Direct measurements of TPT, for three prototypical deepnet architectures and across seven canonical classification datasets, expose a pervasive inductive bias we call Neural Collapse, involving four deeply interconnected phenomena: (NC1) Cross-example within-class variability of last-layer training activations collapses to zero, as the individual activations themselves collapse to their class-means; (NC2) The class-means collapse to the vertices of a Simplex Equiangular Tight Frame (ETF); (NC3) Up to rescaling, the last-layer classifiers collapse to the class-means, or in other words to the Simplex ETF, i.e. to a self-dual configuration; (NC4) For a given activation, the classifier's decision collapses to simply choosing whichever class has the closest train class-mean, i.e. the Nearest Class Center (NCC) decision rule. The symmetric and very simple geometry induced by the TPT confers important benefits, including better generalization performance, better robustness, and better interpretability.

翻译：(NC1)培训深网培训的现代做法涉及培训的结束阶段,从培训错误首先消失的时代开始; 在培训失败的时代,培训错误实际上保持零,而培训损失则推向零。 TPT直接测量了三种原型深网结构以及7个卡通分类数据集的TPT, 暴露了一种普遍的暗示偏差,我们称之为神经崩溃,涉及4个密切相互关联的现象:(NC1) 最后一层培训启动的跨级内变异性崩溃为零,因为个人激活本身会崩溃到他们的阶级; (NC2) 班级中利率向简单度角直角框架(ETF)的顶端倾斜;(NC3) 升至伸缩,最后一层分析器崩溃到阶级,或换句话说到简单 EtFTF, 即自相连接的配置;(NC4) 给定的升级决定崩溃,仅选择哪个班级最接近的班级的班级具有最接近的班级、更稳健性、 irodualcal cal decilal decilateal cal dequistrutal cal) 更好解释。

0

相关内容

Better

2020数据工程师成长路线图

专知会员服务

19+阅读 · 2020年9月6日

【干货书】真实机器学习，264页pdf，Real-World Machine Learning

【干货书】真实机器学习，264页pdf，Real-World Machine Learning

专知会员服务

115+阅读 · 2020年4月5日

【Facebook AI-ICLR2020】神经网络训练早期阶段探究，Early Phase of NN Training

【Facebook AI-ICLR2020】神经网络训练早期阶段探究，Early Phase of NN Training

专知会员服务

18+阅读 · 2020年3月3日

【课程推荐】深度学习中的几何（Geometry of Deep Learning）

【课程推荐】深度学习中的几何（Geometry of Deep Learning）

专知会员服务

58+阅读 · 2019年11月10日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

《DeepGCNs: Making GCNs Go as Deep as CNNs》

《DeepGCNs: Making GCNs Go as Deep as CNNs》

专知会员服务

31+阅读 · 2019年10月17日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

IEEE | DSC 2019诚邀稿件 (EI检索)

IEEE | DSC 2019诚邀稿件 (EI检索)

Call4Papers

10+阅读 · 2019年2月25日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

42+阅读 · 2019年1月3日

Disentangled的假设的探讨

Disentangled的假设的探讨

CreateAMind

9+阅读 · 2018年12月10日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

Adversarial Variational Bayes: Unifying VAE and GAN 代码

Adversarial Variational Bayes: Unifying VAE and GAN 代码

CreateAMind

7+阅读 · 2017年10月4日

【学习】Hierarchical Softmax

【学习】Hierarchical Softmax

机器学习研究会

4+阅读 · 2017年8月6日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

A Wholistic View of Continual Learning with Deep Neural Networks: Forgotten Lessons and the Bridge to Active and Open World Learning

Arxiv

35+阅读 · 2020年9月3日

Reverse Engineering Configurations of Neural Text Generation Models

Arxiv

5+阅读 · 2020年4月13日

On Attribution of Recurrent Neural Network Predictions via Additive Decomposition

Arxiv

3+阅读 · 2019年3月27日

Representation Learning with Contrastive Predictive Coding

Arxiv

6+阅读 · 2019年1月22日

Reward learning from human preferences and demonstrations in Atari

Arxiv

8+阅读 · 2018年11月15日

Learning from Longitudinal Face Demonstration - Where Tractable Deep Modeling Meets Inverse Reinforcement Learning

Learning from Longitudinal Face Demonstration - Where Tractable Deep Modeling Meets Inverse Reinforcement Learning

Arxiv

3+阅读 · 2018年9月4日

Meta-Learning with Latent Embedding Optimization

Meta-Learning with Latent Embedding Optimization

Arxiv

6+阅读 · 2018年7月16日

ALMN: Deep Embedding Learning with Geometrical Virtual Point Generating

Arxiv

5+阅读 · 2018年6月5日

Generative Adversarial Autoencoder Networks

Arxiv

11+阅读 · 2018年3月23日

Denoising Adversarial Autoencoders

Arxiv

9+阅读 · 2018年1月4日

VIP会员

文章信息

相关主题

相关VIP内容

2020数据工程师成长路线图

专知会员服务

19+阅读 · 2020年9月6日

【干货书】真实机器学习，264页pdf，Real-World Machine Learning

【干货书】真实机器学习，264页pdf，Real-World Machine Learning

专知会员服务

115+阅读 · 2020年4月5日

【Facebook AI-ICLR2020】神经网络训练早期阶段探究，Early Phase of NN Training

【Facebook AI-ICLR2020】神经网络训练早期阶段探究，Early Phase of NN Training

专知会员服务

18+阅读 · 2020年3月3日

【课程推荐】深度学习中的几何（Geometry of Deep Learning）

【课程推荐】深度学习中的几何（Geometry of Deep Learning）

专知会员服务

58+阅读 · 2019年11月10日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

《DeepGCNs: Making GCNs Go as Deep as CNNs》

《DeepGCNs: Making GCNs Go as Deep as CNNs》

专知会员服务

31+阅读 · 2019年10月17日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

美军最新条令《海军陆战队空地特遣队目标定位》128页

中文版 | 人工智能战争自适应集群平台（AI-WASP）项目获欧盟4500万欧元资助

《基于分层多智能体强化学习的空战战术优化研究》最新31页

《基于图神经网络与强化学习的自主空战决策研究》

相关资讯

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

IEEE | DSC 2019诚邀稿件 (EI检索)

IEEE | DSC 2019诚邀稿件 (EI检索)

Call4Papers

10+阅读 · 2019年2月25日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

42+阅读 · 2019年1月3日

Disentangled的假设的探讨

Disentangled的假设的探讨

CreateAMind

9+阅读 · 2018年12月10日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

Adversarial Variational Bayes: Unifying VAE and GAN 代码

Adversarial Variational Bayes: Unifying VAE and GAN 代码

CreateAMind

7+阅读 · 2017年10月4日

【学习】Hierarchical Softmax

【学习】Hierarchical Softmax

机器学习研究会

4+阅读 · 2017年8月6日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

相关论文

A Wholistic View of Continual Learning with Deep Neural Networks: Forgotten Lessons and the Bridge to Active and Open World Learning

Arxiv

35+阅读 · 2020年9月3日

Reverse Engineering Configurations of Neural Text Generation Models

Arxiv

5+阅读 · 2020年4月13日

On Attribution of Recurrent Neural Network Predictions via Additive Decomposition

Arxiv

3+阅读 · 2019年3月27日

Representation Learning with Contrastive Predictive Coding

Arxiv

6+阅读 · 2019年1月22日

Reward learning from human preferences and demonstrations in Atari

Arxiv

8+阅读 · 2018年11月15日

Learning from Longitudinal Face Demonstration - Where Tractable Deep Modeling Meets Inverse Reinforcement Learning

Learning from Longitudinal Face Demonstration - Where Tractable Deep Modeling Meets Inverse Reinforcement Learning

Arxiv

3+阅读 · 2018年9月4日

Meta-Learning with Latent Embedding Optimization

Meta-Learning with Latent Embedding Optimization

Arxiv

6+阅读 · 2018年7月16日

ALMN: Deep Embedding Learning with Geometrical Virtual Point Generating

Arxiv

5+阅读 · 2018年6月5日

Generative Adversarial Autoencoder Networks

Arxiv

11+阅读 · 2018年3月23日

Denoising Adversarial Autoencoders

Arxiv

9+阅读 · 2018年1月4日

微信扫码咨询专知VIP会员