Since its inception by Cauchy in 1847, the gradient descent algorithm has lacked guidance on how to set the learning rate efficiently. This paper identifies a concept, defines metrics, and introduces algorithms to provide such guidance. The result is a family of algorithms (Neograd) based on a {\em constant $\rho$ ansatz}, where $\rho$ is a metric based on the error of the updates. This allows the learning rate to be adjusted at each step, using a formulaic estimate based on $\rho$; it is no longer necessary to perform trial runs beforehand to estimate a single learning rate for an entire optimization run. The additional cost of computing this metric is trivial. One member of this family, NeogradM, can quickly reach much lower cost function values than other first-order algorithms. Comparisons are made chiefly between NeogradM and Adam on an array of test functions and on a neural network model for identifying hand-written digits. The results show substantial performance improvements with NeogradM.
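To make the constant-$\rho$ idea concrete, the following is a minimal sketch, not the paper's implementation: here $\rho$ is taken to be the relative error between the actual cost change over one step and its first-order prediction, and the names \texttt{cost\_fn}, \texttt{grad\_fn}, \texttt{rho\_target}, as well as the clipping factors, are illustrative assumptions rather than the paper's definitions.
\begin{verbatim}
import numpy as np

def rho(cost_fn, theta, grad, lr):
    # Illustrative definition (assumption): relative error between the
    # actual cost decrease over one step and its first-order prediction.
    f0 = cost_fn(theta)
    f1 = cost_fn(theta - lr * grad)
    pred_drop = lr * float(np.dot(grad, grad))  # linear prediction of the drop
    actual_drop = f0 - f1
    return abs(actual_drop - pred_drop) / (abs(pred_drop) + 1e-12)

def constant_rho_step(cost_fn, grad_fn, theta, lr, rho_target=0.1):
    # One gradient-descent update whose learning rate is rescaled so that
    # the measured rho stays near rho_target (constant-rho ansatz, sketch).
    g = grad_fn(theta)
    r = rho(cost_fn, theta, g, lr)
    # Proportional correction, clipped to avoid overly aggressive jumps.
    lr = lr * min(max(rho_target / (r + 1e-12), 0.5), 2.0)
    return theta - lr * g, lr

# Usage on a simple quadratic bowl f(x) = x . x
cost = lambda x: float(x @ x)
grad = lambda x: 2.0 * x
theta, lr = np.array([3.0, -4.0]), 0.1
for _ in range(50):
    theta, lr = constant_rho_step(cost, grad, theta, lr)
\end{verbatim}
On this quadratic example the measured $\rho$ equals the learning rate itself, so the proportional correction holds $\rho$ at the target value; the paper's actual formulas and metrics are developed in the sections that follow.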