When training neural networks, it has been widely observed that a large step size is essential in stochastic gradient descent (SGD) for obtaining superior models. However, the effect of large step sizes on the success of SGD is not well understood theoretically. Several previous works have attributed this success to the stochastic noise present in SGD. In contrast, we show through a novel set of experiments that the stochastic noise is not sufficient to explain good non-convex training, and that instead the effect of a large learning rate itself is essential for obtaining the best performance. We demonstrate the same effects also in the noise-less case, i.e., for full-batch GD. We formally prove that GD with a large step size -- on certain non-convex function classes -- follows a different trajectory than GD with a small step size, which can lead to convergence to a global minimum instead of a local one. Our settings provide a framework for future analysis which allows comparing algorithms based on behaviors that cannot be observed in the traditional settings.
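To make the claimed mechanism concrete, the following is a minimal numerical sketch (our own toy construction, not the function classes or proofs from the paper): on a one-dimensional objective with a sharp local minimum and a flat global minimum, full-batch GD started in the sharp basin converges to the local minimum when the step size is small, but with a large step size it is repelled from the sharp minimum and settles at the flat global minimum instead. The piecewise-quadratic objective, step sizes, and initialization are illustrative choices.

```python
# Toy piecewise-quadratic objective (an illustrative choice, not the function
# class analyzed in the paper): a sharp local minimum at x = 1 (value 0) and a
# flat global minimum at x = -1 (value -1.22). The two quadratic pieces meet
# continuously at x = 0.8.
def f(x):
    return 10.0 * (x - 1.0) ** 2 if x >= 0.8 else 0.5 * (x + 1.0) ** 2 - 1.22

def grad(x):
    return 20.0 * (x - 1.0) if x >= 0.8 else (x + 1.0)

def gd(x0, step_size, num_steps=200):
    """Full-batch gradient descent with a fixed step size."""
    x = x0
    for _ in range(num_steps):
        x = x - step_size * grad(x)
    return x

x0 = 1.05  # initialized inside the basin of the sharp local minimum

# Small step size: GD stays in the sharp basin and converges to the local minimum.
x_small = gd(x0, step_size=0.01)

# Large step size: GD is unstable at the sharp minimum (curvature 20 > 2/0.12),
# oscillates out of that basin, and then settles at the flat global minimum,
# where the same step size is stable (curvature 1 < 2/0.12).
x_large = gd(x0, step_size=0.12)

print(f"small step size: x = {x_small:+.3f}, f(x) = {f(x_small):+.3f}")  # ~ +1.000, +0.000
print(f"large step size: x = {x_large:+.3f}, f(x) = {f(x_large):+.3f}")  # ~ -1.000, -1.220
```

Both runs see the exact gradient (no stochastic noise), so the different outcomes are driven purely by the step size, mirroring the full-batch GD comparison described above.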