Almost Sure Saddle Avoidance of Stochastic Gradient Methods without the Bounded Gradient Assumption - Zhuanzhi Papers

Almost surely · Stochastic gradient descent · SGD · Empirical risk · Empirical risk minimization

February 15, 2023

Almost Sure Saddle Avoidance of Stochastic Gradient Methods without the Bounded Gradient Assumption

Jun Liu, Ye Yuan

We prove that various stochastic gradient descent methods, including the stochastic gradient descent (SGD), stochastic heavy-ball (SHB), and stochastic Nesterov's accelerated gradient (SNAG) methods, almost surely avoid any strict saddle manifold. To the best of our knowledge, this is the first time such results are obtained for SHB and SNAG methods. Moreover, our analysis expands upon previous studies on SGD by removing the need for bounded gradients of the objective function and uniformly bounded noise. Instead, we introduce a more practical local boundedness assumption for the noisy gradient, which is naturally satisfied in empirical risk minimization problems typically seen in training of neural networks.
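To make the three methods named in the abstract concrete, here is a minimal sketch of SGD, SHB, and SNAG on a toy objective with a strict saddle. Everything in it is an assumption chosen for illustration and is not taken from the paper: the objective f(x, y) = 0.5x² − 0.5y² + 0.25y⁴, the Gaussian gradient noise, the constant step size, the momentum value, and the function names `grad`, `noisy_grad`, and `run`. The update rules are common textbook forms and may differ from the parameterization the authors analyze.

```python
import numpy as np


def grad(w):
    # Gradient of the toy objective f(x, y) = 0.5*x**2 - 0.5*y**2 + 0.25*y**4.
    # Its Hessian at (0, 0) is diag(1, -1), so the origin is a strict saddle;
    # the minimizers are (0, 1) and (0, -1).
    x, y = w
    return np.array([x, -y + y**3])


def noisy_grad(w, rng, sigma=0.1):
    # Hypothetical noise model: exact gradient plus isotropic Gaussian noise.
    return grad(w) + sigma * rng.standard_normal(2)


def run(method, steps=5000, lr=0.01, beta=0.9, seed=0):
    # Run one of the three stochastic methods, starting exactly at the saddle.
    rng = np.random.default_rng(seed)
    w = np.zeros(2)  # iterate, initialized at the strict saddle
    v = np.zeros(2)  # momentum buffer (unused by plain SGD)
    for _ in range(steps):
        if method == "sgd":
            w = w - lr * noisy_grad(w, rng)
        elif method == "shb":
            # Stochastic heavy-ball: momentum on the gradient at the current iterate.
            v = beta * v - lr * noisy_grad(w, rng)
            w = w + v
        elif method == "snag":
            # Stochastic Nesterov: gradient evaluated at the look-ahead point.
            v = beta * v - lr * noisy_grad(w + beta * v, rng)
            w = w + v
        else:
            raise ValueError(f"unknown method: {method}")
    return w


if __name__ == "__main__":
    for name in ("sgd", "shb", "snag"):
        print(name, run(name))
```

In this toy setting the gradient noise pushes each iterate off the saddle along the negative-curvature direction y, after which it settles near one of the two minimizers. A single simulated run only illustrates the behavior; it says nothing about the almost-sure guarantee, which is what the paper proves.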
