Positive-Negative Momentum: Manipulating Stochastic Gradient Noise to Improve Generalization - Zhuanzhi Papers

Generalization theory · Momentum · Learning · Noise · SGD

August 30, 2022

Positive-Negative Momentum: Manipulating Stochastic Gradient Noise to Improve Generalization

Zeke Xie, Li Yuan, Zhanxing Zhu, Masashi Sugiyama

From arXiv, ICML 2021; 20 pages; 13 figures. We fixed some typos in the updated version.

It is well known that stochastic gradient noise (SGN) acts as implicit regularization for deep learning and is essential to both the optimization and the generalization of deep networks. Some works have attempted to artificially simulate SGN by injecting random noise to improve deep learning. However, injected simple random noise turns out not to work as well as SGN, which is anisotropic and parameter-dependent. To simulate SGN at low computational cost and without changing the learning rate or batch size, we propose the Positive-Negative Momentum (PNM) approach, a powerful alternative to conventional Momentum in classic optimizers. PNM maintains two approximately independent momentum terms, so that the magnitude of SGN can be controlled explicitly by adjusting the momentum difference. We theoretically prove the convergence guarantee and the generalization advantage of PNM over Stochastic Gradient Descent (SGD). By incorporating PNM into two conventional optimizers, SGD with Momentum and Adam, our extensive experiments empirically verify the significant advantage of the PNM-based variants over the corresponding conventional Momentum-based optimizers.
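
The abstract describes PNM only in words: two approximately independent momentum terms, combined through a tunable difference. The sketch below is one way this idea might look in code. It assumes each momentum buffer is refreshed on alternating steps and that the update uses a positive weight on the fresh buffer and a negative weight on the other, with a normalizing factor. The names `pnm_step`, `run`, `beta0`, `beta1`, and `lr`, the exact form of the update, and the toy quadratic objective are all illustrative assumptions, not details given in the text above. Consult the paper for the authors' actual algorithm.

```python
import math
import random


def pnm_step(theta, grad, m_cur, m_prev, lr=0.1, beta1=0.9, beta0=1.0):
    """One hypothetical positive-negative momentum step on lists of floats.

    m_cur  : buffer that last saw a gradient two steps ago (refreshed now)
    m_prev : buffer that was refreshed on the previous step (left unchanged)
    Returns (new theta, buffer to refresh next step, buffer just refreshed).
    """
    # Refresh only one buffer, so the two buffers see disjoint minibatches.
    m_new = [beta1 ** 2 * m + (1 - beta1 ** 2) * g for m, g in zip(m_cur, grad)]
    # Assumed normalization keeping the overall step scale comparable.
    norm = math.sqrt((1 + beta0) ** 2 + beta0 ** 2)
    # Positive weight on the fresh buffer, negative weight on the other one;
    # a larger beta0 widens the difference and amplifies the noise term.
    theta_new = [
        t - lr / norm * ((1 + beta0) * a - beta0 * b)
        for t, a, b in zip(theta, m_new, m_prev)
    ]
    # Swap roles for the next call.
    return theta_new, m_prev, m_new


def run(steps=200, beta0=1.0, seed=0):
    """Minimize the toy objective 0.5 * ||theta||^2 with noisy gradients."""
    rng = random.Random(seed)
    theta = [5.0, -3.0]
    m_a = [0.0, 0.0]
    m_b = [0.0, 0.0]
    for _ in range(steps):
        # Exact gradient is theta; Gaussian noise stands in for minibatch noise.
        grad = [t + rng.gauss(0.0, 0.1) for t in theta]
        theta, m_a, m_b = pnm_step(theta, grad, m_a, m_b, beta0=beta0)
    return theta


print(run())
```

In this sketch, setting `beta0=0.0` removes the negative term, leaving a plain momentum update drawn from a single buffer. Raising `beta0` increases the weight on the difference between the two buffers, which is how the sketch mimics the abstract's claim that the noise magnitude can be controlled explicitly.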
