Learning rate schedulers have been widely adopted in training deep neural networks. Despite their practical importance, there is a discrepancy between their practice and their theoretical analysis. For instance, it is not known which schedules allow SGD to achieve the best convergence, even for simple problems such as optimizing quadratic objectives. In this paper, we propose Eigencurve, the first family of learning rate schedules that can achieve minimax optimal convergence rates (up to a constant) for SGD on quadratic objectives when the eigenvalue distribution of the underlying Hessian matrix is skewed. This condition is quite common in practice. Experimental results show that Eigencurve can significantly outperform step decay in image classification tasks on CIFAR-10, especially when the number of epochs is small. Moreover, the theory inspires two simple learning rate schedulers for practical applications that approximate Eigencurve. For some problems, the optimal shape of the proposed schedulers resembles that of cosine decay, which sheds light on the success of cosine decay in such situations. For other situations, the proposed schedulers are superior to cosine decay.
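For reference, the two baseline schedules mentioned above, step decay and cosine decay, can be written in a few lines. The following is a minimal sketch with hypothetical parameter choices (eta_max, decay factor, number of stages); it is not the Eigencurve schedule itself, which additionally depends on the eigenvalue distribution of the Hessian.

```python
import math

def cosine_decay(t, T, eta_max, eta_min=0.0):
    """Cosine decay: anneal the learning rate from eta_max to eta_min
    over T steps, following half a cosine period."""
    return eta_min + 0.5 * (eta_max - eta_min) * (1 + math.cos(math.pi * t / T))

def step_decay(t, T, eta_max, factor=0.1, num_stages=3):
    """Step decay: split the training horizon into equal stages and
    multiply the learning rate by `factor` at each stage boundary."""
    stage = min(int(t * num_stages / T), num_stages - 1)
    return eta_max * (factor ** stage)

# Example: compare the two schedules over a 100-step horizon.
T, eta_max = 100, 0.1
for t in (0, 25, 50, 75, 99):
    print(t, round(cosine_decay(t, T, eta_max), 4), round(step_decay(t, T, eta_max), 4))
```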