Google Releases 82-Page Survey Paper: On the Generalization Mystery in Deep Learning


March 22, 2022

Zhuanzhi (专知) Member Services


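The generalization mystery in deep learning is this: why do over-parameterized neural networks trained with gradient descent (GD) generalize well on real datasets, even though they are capable of fitting random datasets of comparable size? Furthermore, among all the solutions that fit the training data, how does GD find one that generalizes well (when such a solution exists)?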

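We argue that the answer to both questions lies in the interaction between the gradients of different examples during training. Intuitively, if the per-example gradients are well aligned, that is, if they are coherent, then GD can be expected to be (algorithmically) stable and therefore to generalize well. We formalize this argument with a coherence metric that is easy to compute and interpret, and show that for several common vision networks the metric takes very different values on real and random datasets. The theory also explains several other phenomena in deep learning, such as why some examples are reliably learned earlier than others, why early stopping works, and why it is possible to learn from noisy labels. Because the theory gives a causal explanation of how GD finds a well-generalizing solution when one exists, it motivates a class of simple modifications to GD that reduce memorization and improve generalization.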
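To make the notion of gradient alignment concrete, here is a minimal sketch of one way to measure it. It assumes coherence is the ratio of the squared norm of the mean gradient to the mean squared norm of the per-example gradients. This particular formula and the function name `gradient_coherence` are illustrative assumptions, since the abstract does not state the paper's exact definition. Under this assumption, identical gradients score 1, and m mutually orthogonal gradients of equal norm score 1/m.

```python
import numpy as np


def gradient_coherence(per_example_grads):
    """Alignment score for a set of per-example gradients.

    ASSUMPTION: coherence is ||mean(g)||^2 / mean(||g||^2). This is an
    illustrative choice, not necessarily the paper's exact metric.

    per_example_grads: array-like of shape (m, d), one flattened
    gradient per training example.
    """
    g = np.asarray(per_example_grads, dtype=float)
    mean_grad = g.mean(axis=0)                  # average gradient, shape (d,)
    mean_sq_norm = (g ** 2).sum(axis=1).mean()  # average of ||g_i||^2
    if mean_sq_norm == 0.0:
        return 0.0                              # all gradients are zero
    return float(mean_grad @ mean_grad / mean_sq_norm)


# Perfectly aligned gradients: every example pulls in the same direction.
aligned = np.ones((5, 3))
# Mutually orthogonal gradients: each example pulls in its own direction.
orthogonal = np.eye(4)

print(gradient_coherence(aligned))
print(gradient_coherence(orthogonal))
```

In practice the rows would be flattened per-example gradients of the loss with respect to the network parameters, computed on the same batch for a real-label and a random-label version of the dataset so the two scores can be compared.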

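Generalization in deep learning is an extremely broad phenomenon, and so it requires an equally general explanation. We conclude with a survey of alternative lines of attack on this problem, and argue that, on this basis, the proposed approach is the most viable one.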
