Recently, many first- and second-order variants of SGD have been proposed to facilitate the training of Deep Neural Networks (DNNs). A common limitation of these works is that they use the same learning rate for all instances in the dataset. This setting is widely adopted under the assumption that the loss functions of individual instances are similar in nature, and hence a common learning rate can be used. In this work, we relax this assumption and propose an optimization framework that accounts for differences in loss-function characteristics across instances. More specifically, our optimizer learns a dynamic learning rate for each instance in the dataset. Learning a per-instance dynamic learning rate allows our optimization framework to focus on different modes of the training data during optimization. When applied to an image classification task across different CNN architectures, learning dynamic learning rates yields consistent gains over standard optimizers. When applied to a dataset containing corrupt instances, our framework reduces the learning rates on noisy instances and improves over the state of the art. Finally, we show that our optimization framework can be used to personalize a machine learning model towards a known target data distribution.
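To make the idea of per-instance learning rates concrete, the sketch below shows SGD on a toy linear-regression problem where each training instance keeps its own rate. This is not the paper's learned optimizer: the multiplicative adaptation rule, the toy data, the corruption setup, and all variable names (`eta`, `prev_loss`, etc.) are illustrative assumptions standing in for the learned per-instance rates described above.

```python
import numpy as np

# Minimal sketch (illustrative, not the paper's method): SGD with one
# learning rate per training instance. A simple heuristic shrinks an
# instance's rate when its loss increases (as might happen for noisy or
# corrupt instances) and grows it slightly when its loss improves.

rng = np.random.default_rng(0)

# Toy data: y = X @ w_true + small noise, with the first 10 labels corrupted.
n, d = 200, 5
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.01 * rng.normal(size=n)
y[:10] += 5.0 * rng.normal(size=10)

w = np.zeros(d)
eta = np.full(n, 0.05)            # one learning rate per instance
prev_loss = np.full(n, np.inf)    # previous per-instance loss

for epoch in range(20):
    for i in rng.permutation(n):
        pred = X[i] @ w
        loss = 0.5 * (pred - y[i]) ** 2
        grad = (pred - y[i]) * X[i]
        # Per-instance adaptation: halve the rate if this instance's loss
        # went up, otherwise grow it mildly (capped for stability).
        eta[i] = eta[i] * 0.5 if loss > prev_loss[i] else min(eta[i] * 1.05, 0.1)
        prev_loss[i] = loss
        w -= eta[i] * grad

print("mean rate on corrupt instances:", eta[:10].mean())
print("mean rate on clean instances:  ", eta[10:].mean())
print("weight error:", np.linalg.norm(w - w_true))
```

Under this toy rule, the corrupt instances typically end up with smaller rates than the clean ones, mirroring the behavior the abstract reports for noisy data; the paper's framework instead learns these rates rather than setting them heuristically.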