SGEM: Stochastic Gradient with Energy and Momentum - Zhuanzhi Papers


August 3, 2022


Hailiang Liu, Xuping Tian

From arXiv, 24 pages, 4 figures

In this paper, we propose SGEM, Stochastic Gradient with Energy and Momentum, to solve a large class of general non-convex stochastic optimization problems, based on the AEGD method that originated in the work [AEGD: Adaptive Gradient Descent with Energy. arXiv: 2010.05109]. SGEM incorporates both energy and momentum so as to inherit the advantages of each. We show that SGEM features an unconditional energy stability property, and derive energy-dependent convergence rates in the general non-convex stochastic setting, as well as a regret bound in the online convex setting. A lower threshold for the energy variable is also provided. Our experimental results show that SGEM converges faster than AEGD and generalizes better than, or at least as well as, SGDM in training some deep neural networks.
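The abstract describes combining an energy variable with momentum but does not give the update rule. The following is a minimal sketch of one way such an update could look. The specific formulas (the energy variable `r` initialised to `sqrt(f + c)`, the scaled momentum `v`, and the update `r <- r / (1 + 2*lr*v^2)`) are assumptions in the spirit of the AEGD line of work, not taken from this page, and the function name `sgem_sketch` and its parameters are hypothetical. A deterministic gradient on a one-dimensional quadratic stands in for the stochastic gradient so that the run is reproducible.

```python
import math

def sgem_sketch(loss, grad, theta0, lr=0.1, beta=0.9, c=1.0, steps=300):
    """Illustrative energy-plus-momentum update (assumed form, scalar case)."""
    theta = theta0
    m = 0.0                              # momentum: moving average of gradients
    r = math.sqrt(loss(theta0) + c)      # energy variable, assumed init sqrt(f + c)
    energies = [r]
    for t in range(1, steps + 1):
        g = grad(theta)
        m = beta * m + (1.0 - beta) * g
        m_hat = m / (1.0 - beta ** t)    # bias-corrected momentum
        # Scale by the square-root-transformed loss, as in energy-based methods.
        v = m_hat / (2.0 * math.sqrt(loss(theta) + c))
        # The divisor is >= 1, so r can never increase, whatever lr is.
        r = r / (1.0 + 2.0 * lr * v * v)
        theta = theta - 2.0 * lr * r * v
        energies.append(r)
    return theta, energies

# Toy problem: f(theta) = (theta - 3)^2, minimised at theta = 3.
loss = lambda th: (th - 3.0) ** 2
grad = lambda th: 2.0 * (th - 3.0)
theta_final, energies = sgem_sketch(loss, grad, theta0=0.0)
```

In this sketch the energy sequence is non-increasing by construction for any positive step size, which is one plausible reading of the "unconditional energy stability" mentioned in the abstract. Because the effective step is proportional to `r`, a lower bound on the energy variable matters for continued progress.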



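The momentum method (Polyak, 1964) is designed to accelerate learning, especially in the presence of high curvature, small but consistent gradients, or noisy gradients. The momentum algorithm accumulates an exponentially decaying moving average of past gradients and continues to move in their direction.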
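As a minimal sketch of the classical momentum update described above, the snippet below keeps a velocity that accumulates past gradients with exponential decay and moves the parameter along it. The function name `momentum_descent`, the hyperparameter values, and the toy quadratic objective are illustrative assumptions, not taken from the source.

```python
def momentum_descent(grad, x0, lr=0.1, beta=0.9, steps=500):
    """Gradient descent with classical (heavy-ball) momentum on a scalar."""
    x = x0
    velocity = 0.0  # exponentially decaying accumulation of past gradients
    for _ in range(steps):
        g = grad(x)
        # Older gradients are damped by beta; the newest one is added in.
        velocity = beta * velocity - lr * g
        # Keep moving in the accumulated direction.
        x = x + velocity
    return x

# Toy objective f(x) = (x - 3)^2 with gradient 2 * (x - 3).
x_min = momentum_descent(lambda x: 2.0 * (x - 3.0), x0=0.0)
```

With `beta = 0` this reduces to plain gradient descent. Larger values of `beta` let small but consistent gradients build up into larger steps.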
