Generalization analyses of deep learning typically assume that training converges to a fixed point. However, recent results indicate that, in practice, the weights of deep neural networks optimized with stochastic gradient descent often oscillate indefinitely. To reduce this discrepancy between theory and practice, this paper focuses on the generalization of neural networks whose training dynamics do not necessarily converge to fixed points. Our main contribution is to propose a notion of statistical algorithmic stability (SAS) that extends classical algorithmic stability to non-convergent algorithms and to study its connection to generalization. This ergodic-theoretic approach yields new insights compared to the traditional optimization and learning theory perspectives. We prove that the stability of a learning algorithm's time-asymptotic behavior relates to its generalization, and we empirically demonstrate how loss dynamics can provide clues to generalization performance. Our findings provide evidence that networks that "train stably generalize better," even when training continues indefinitely and the weights do not converge.
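As a rough, hypothetical illustration of the kind of quantity such an analysis considers (this is not the paper's SAS estimator), the sketch below compares time-averaged tail losses of SGD runs on two training sets that differ in a single example, in the spirit of algorithmic stability under non-convergent dynamics. All function names (make_data, sgd_loss_trajectory, tail_average), hyperparameters, and the choice of the tail-averaged loss gap as a stability proxy are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n=200, d=5):
    # Toy binary-classification data; purely illustrative.
    X = rng.normal(size=(n, d))
    w_true = rng.normal(size=d)
    y = (X @ w_true > 0).astype(float)
    return X, y

def sgd_loss_trajectory(X, y, steps=5000, lr=0.5):
    # Constant-step SGD on logistic loss; record the full-batch training
    # loss at every step (the "loss dynamics"), which need not converge.
    n, d = X.shape
    w = np.zeros(d)
    losses = np.empty(steps)
    for t in range(steps):
        i = rng.integers(n)
        p = 1.0 / (1.0 + np.exp(-(X[i] @ w)))
        w -= lr * (p - y[i]) * X[i]
        z = X @ w
        # Numerically stable logistic loss: log(1 + exp(z)) - y * z
        losses[t] = np.mean(np.maximum(z, 0) + np.log1p(np.exp(-np.abs(z))) - y * z)
    return losses

def tail_average(losses, burn_in=0.5):
    # Time average over the tail of training: an ergodic-style statistic of
    # the time-asymptotic behavior, rather than a value at a converged point.
    start = int(len(losses) * burn_in)
    return losses[start:].mean()

# Two training sets differing in a single example (leave-one-out style perturbation).
X, y = make_data()
X_pert, y_pert = X.copy(), y.copy()
X_pert[0] = rng.normal(size=X.shape[1])
y_pert[0] = 1.0 - y_pert[0]

gap = abs(tail_average(sgd_loss_trajectory(X, y))
          - tail_average(sgd_loss_trajectory(X_pert, y_pert)))
print(f"Tail-averaged loss gap under a one-example perturbation: {gap:.4f}")
```

Under these assumptions, a small gap indicates that the algorithm's long-run loss statistics are insensitive to replacing one training example, which is the intuition behind relating stability of time-asymptotic behavior to generalization.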