Stochastic Gradient Descent (SGD) is the workhorse algorithm of deep learning. At each step of training, a mini-batch of samples is drawn from the training dataset and the weights of the neural network are adjusted according to the performance on this specific subset of examples. The mini-batch sampling procedure introduces stochastic dynamics into the gradient descent, with a non-trivial, state-dependent noise. We characterize the stochasticity of SGD and of a recently introduced variant, \emph{persistent} SGD, in a prototypical neural network model. In the under-parametrized regime, where the final training error is positive, the SGD dynamics reaches a stationary state, and we define an effective temperature from the fluctuation-dissipation theorem, computed from dynamical mean-field theory. We use the effective temperature to quantify the magnitude of the SGD noise as a function of the problem parameters. In the over-parametrized regime, where the training error vanishes, we measure the noise magnitude of SGD by computing the average distance between two replicas of the system with the same initialization and two different realizations of the SGD noise. We find that the two noise measures behave similarly as a function of the problem parameters. Moreover, we observe that noisier algorithms lead to wider decision boundaries of the corresponding constraint satisfaction problem.
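The two-replica noise measure mentioned above can be illustrated with a minimal toy sketch: train two copies of a model from the same initialization, but with independently sampled mini-batches, and measure the distance between the final weight vectors. This is only an illustrative assumption of the protocol on a linear model with square loss; the paper's actual model, loss, and normalization differ, and all names here (`sgd_replica_distance`, the teacher-student data) are hypothetical.

```python
import numpy as np

def sgd_replica_distance(X, y, lr=0.1, batch_size=8, steps=500, seed_init=0):
    """Train two SGD replicas from the same initialization but with
    independent mini-batch sampling seeds (two realizations of the SGD
    noise), and return the normalized distance between their final
    weights. Toy linear model with square loss, for illustration only."""
    n, d = X.shape
    rng_init = np.random.default_rng(seed_init)
    w0 = rng_init.normal(size=d) / np.sqrt(d)  # shared initialization

    def run(seed):
        rng = np.random.default_rng(seed)  # noise realization = sampling seed
        w = w0.copy()
        for _ in range(steps):
            idx = rng.choice(n, size=batch_size, replace=False)
            # gradient of 0.5 * mean((X w - y)^2) over the mini-batch
            grad = X[idx].T @ (X[idx] @ w - y[idx]) / batch_size
            w -= lr * grad
        return w

    w_a, w_b = run(1), run(2)  # same init, two SGD noise realizations
    return np.linalg.norm(w_a - w_b) / np.sqrt(d)

# Synthetic teacher-student data (illustrative, not the paper's setup)
rng = np.random.default_rng(42)
d, n = 50, 200
X = rng.normal(size=(n, d))
w_star = rng.normal(size=d)
y = X @ w_star / np.sqrt(d)
print(sgd_replica_distance(X, y))
```

In this sketch the mini-batch sampling seed is the only source of stochasticity, so any nonzero final distance is attributable entirely to the SGD noise, mirroring the measurement described in the abstract.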