The learning rate is one of the most important hyper-parameters and has a significant influence on neural network training. Learning rate schedules are widely used in practice to adjust the learning rate according to pre-defined schedules for fast convergence and good generalization. However, existing learning rate schedules are all heuristic algorithms that lack theoretical support. Therefore, people usually choose learning rate schedules through multiple ad-hoc trials, and the obtained schedules are sub-optimal. To boost the performance of such sub-optimal learning rate schedules, we propose a generic learning rate schedule plugin, called LEArning Rate Perturbation (LEAP), which can be applied to various learning rate schedules to improve model training by introducing a certain perturbation to the learning rate. We find that, with this simple yet effective strategy, the training process exponentially favors flat minima over sharp minima with guaranteed convergence, which leads to better generalization. In addition, we conduct extensive experiments showing that training with LEAP improves the performance of diverse deep learning models on diverse datasets using various learning rate schedules (including a constant learning rate).
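To make the idea of a "schedule plugin" concrete, the sketch below perturbs the value produced by a base schedule before it is handed to the optimizer. The cosine base schedule, the multiplicative zero-mean Gaussian noise, and all function names (cosine_schedule, perturbed_lr) and parameters (sigma) are illustrative assumptions, not the exact LEAP formulation described in the paper.

```python
import math
import random

def cosine_schedule(step, total_steps, base_lr=0.1, min_lr=1e-4):
    """A standard cosine-annealing schedule, used here only as an example base schedule."""
    t = step / max(1, total_steps)
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * t))

def perturbed_lr(step, total_steps, sigma=0.1, seed=None):
    """Return the scheduled learning rate with a small zero-mean perturbation.

    sigma controls the relative perturbation magnitude (an assumed hyper-parameter);
    the result is clipped to stay positive so training remains stable.
    """
    rng = random.Random(None if seed is None else seed + step)
    lr = cosine_schedule(step, total_steps)
    noise = rng.gauss(0.0, sigma)           # zero-mean multiplicative perturbation
    return max(lr * (1.0 + noise), 1e-8)    # keep the learning rate strictly positive

if __name__ == "__main__":
    total = 1000
    for step in (0, 250, 500, 750, 999):
        print(step, round(perturbed_lr(step, total, sigma=0.1, seed=42), 5))
```

In a training loop, the perturbed value would simply replace the learning rate that the base schedule would otherwise assign at each step, so the plugin can wrap any existing schedule, including a constant one.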