Conventional wisdom dictates that the learning rate should lie in the stable regime so that gradient-based algorithms do not blow up. This letter introduces a simple scenario where an unstably large learning rate scheme leads to super-fast convergence, with a convergence rate that depends only logarithmically on the condition number of the problem. Our scheme uses a Cyclical Learning Rate (CLR): in each cycle we take one large unstable step and several small stable steps to compensate for the instability. These findings also help explain the empirical observations of [Smith and Topin, 2019], where CLR with a large maximum learning rate dramatically accelerates learning and leads to so-called "super-convergence". We prove that our scheme excels on problems where the Hessian exhibits a bimodal spectrum, i.e., its eigenvalues can be grouped into two clusters (small and large). The unstably large step is the key to enabling fast convergence over the small-eigenvalue cluster.
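The following is a minimal numerical sketch of the idea, not the paper's exact construction: gradient descent on a diagonal quadratic whose Hessian spectrum is bimodal, with one large unstable step followed by roughly log(condition number) small stable steps per cycle. All eigenvalue clusters and step-size constants below are illustrative assumptions.

```python
import numpy as np

# Illustrative sketch (assumed constants): f(x) = 0.5 * x^T diag(H) x
# with a bimodal Hessian spectrum.
small_eigs = np.linspace(1.0, 1.5, 5)     # small cluster (around mu ~ 1)
large_eigs = np.linspace(80.0, 100.0, 5)  # large cluster (around L ~ 100)
H = np.concatenate([small_eigs, large_eigs])
kappa = H.max() / H.min()                 # condition number ~ 100

eta_small = 1.0 / H.max()                                # stable step, safe for all directions
eta_big = 2.0 / (small_eigs.min() + small_eigs.max())    # unstable step, tuned to the small cluster
k_small = int(np.ceil(np.log(kappa)))                    # ~log(kappa) stable steps per cycle

x = np.ones_like(H)  # start away from the optimum x* = 0
for cycle in range(10):
    # One large, unstable step: contracts the small-eigenvalue directions,
    # but inflates the large-eigenvalue directions by a factor of order kappa.
    x = x - eta_big * (H * x)
    # Several small, stable steps: damp that blow-up again.
    for _ in range(k_small):
        x = x - eta_small * (H * x)
    print(f"cycle {cycle}: ||x - x*|| = {np.linalg.norm(x):.2e}")
```

With these assumed constants, each cycle contracts the error along the small-eigenvalue cluster by a constant factor, while the blow-up along the large-eigenvalue cluster incurred by the single unstable step is erased by the handful of stable steps, consistent with the logarithmic dependence on the condition number described above.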