Several papers argue that wide minima generalize better than narrow minima. In this paper, through detailed experiments, we not only corroborate the generalization properties of wide minima but also provide empirical evidence for a new hypothesis: the density of wide minima is likely lower than the density of narrow minima. Motivated by this hypothesis, we design a novel explore-exploit learning rate schedule. On a variety of image and natural language datasets, compared to their original hand-tuned learning rate baselines, we show that our explore-exploit schedule can yield either up to 0.84% higher absolute accuracy within the original training budget or up to 57% reduced training time while matching the originally reported accuracy. For example, we achieve state-of-the-art (SOTA) accuracy on the IWSLT'14 (DE-EN) dataset by modifying only the learning rate schedule of a high-performing model.
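The abstract describes the explore-exploit schedule only at a high level. The sketch below illustrates one plausible reading under stated assumptions: a constant high learning rate during an "explore" phase, followed by a decaying "exploit" phase. The function name `explore_exploit_lr`, the 50-epoch explore budget, and the linear-decay choice are illustrative assumptions, not the paper's exact recipe.

```python
# Hedged sketch of a two-phase explore-exploit learning rate schedule,
# based only on the abstract's description. All parameter names and the
# linear-decay choice are illustrative assumptions.
import torch

def explore_exploit_lr(epoch, total_epochs, explore_epochs, peak_scale):
    """Return an LR scale factor for the given epoch (assumed schedule)."""
    if epoch < explore_epochs:
        return peak_scale  # explore: hold the high LR to seek a wide minimum
    # exploit: linearly decay the LR to zero over the remaining epochs
    remaining = max(1, total_epochs - explore_epochs)
    progress = (epoch - explore_epochs) / remaining
    return peak_scale * (1.0 - progress)

# Usage with a PyTorch optimizer via LambdaLR; the lambda returns a factor
# that multiplies the optimizer's base lr (here 0.1 acts as the peak LR).
model = torch.nn.Linear(10, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer,
    lr_lambda=lambda epoch: explore_exploit_lr(epoch, 100, 50, 1.0),
)
for epoch in range(100):
    # ... one epoch of training would run here ...
    optimizer.step()   # placeholder; real code steps per batch with gradients
    scheduler.step()   # advance the LR schedule once per epoch
```

One design consideration under this reading: the explore budget must be long enough for SGD to escape narrow minima at the high learning rate before the exploit phase begins its descent.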