Simple and Optimal Stochastic Gradient Methods for Nonsmooth Nonconvex Optimization - Zhuanzhi (专知) Papers


Optimizer · Nonconvex · Minimum point · SimPLe · Stationary

August 22, 2022

Simple and Optimal Stochastic Gradient Methods for Nonsmooth Nonconvex Optimization


Zhize Li, Jian Li

from arxiv, 60 pages. To appear in JMLR. arXiv admin note: text overlap with arXiv:1904.09265

We propose and analyze several stochastic gradient algorithms for finding stationary points or local minima in nonconvex finite-sum and online optimization problems, possibly with a nonsmooth regularizer. First, we propose a simple proximal stochastic gradient algorithm based on variance reduction, called ProxSVRG+. We provide a clean and tight analysis of ProxSVRG+, which shows that it outperforms deterministic proximal gradient descent (ProxGD) for a wide range of minibatch sizes, and hence solves an open problem posed in Reddi et al. (2016b). Also, ProxSVRG+ uses far fewer proximal oracle calls than ProxSVRG (Reddi et al., 2016b) and extends to the online setting by avoiding full gradient computations. We then propose an optimal algorithm, called SSRGD, based on SARAH (Nguyen et al., 2017), and show that SSRGD further improves the gradient complexity of ProxSVRG+ and achieves the optimal upper bound, matching the known lower bound (Fang et al., 2018; Li et al., 2021). Moreover, we show that both ProxSVRG+ and SSRGD automatically adapt to local structure of the objective function, such as the Polyak-Łojasiewicz (PL) condition for nonconvex functions in the finite-sum case; that is, we prove that both can automatically switch to faster global linear convergence without the restarts performed in prior work such as ProxSVRG (Reddi et al., 2016b). Finally, we focus on the more challenging problem of finding an $(\epsilon, \delta)$-local minimum instead of just an $\epsilon$-approximate (first-order) stationary point (which may be a bad unstable saddle point). We show that SSRGD can find an $(\epsilon, \delta)$-local minimum by simply adding some random perturbations. Our algorithm is almost as simple as its counterpart for finding stationary points, and achieves similar optimal rates.
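To make the idea of a variance-reduced proximal stochastic gradient step concrete, here is a minimal sketch in Python. It is not the authors' ProxSVRG+ or SSRGD: it uses a generic SVRG-style estimator on an illustrative least-squares objective with an l1 regularizer, and every name and default (`prox_vr_sgd`, `anchor_batch`, the step size `eta`, and so on) is an assumption made for this example. Setting `anchor_batch` below the number of samples mimics, loosely, the idea of avoiding full gradient computations mentioned in the abstract.

```python
import numpy as np


def soft_threshold(x, tau):
    """Proximal operator of tau * ||x||_1 (soft-thresholding)."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)


def objective(A, y, x, lam):
    """Illustrative composite objective: least squares plus an l1 regularizer."""
    return 0.5 * np.mean((A @ x - y) ** 2) + lam * np.sum(np.abs(x))


def prox_vr_sgd(A, y, lam=0.01, eta=0.1, epochs=20, inner=10,
                batch=8, anchor_batch=None, seed=0):
    """SVRG-style proximal stochastic gradient loop (hypothetical parameters)."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    for _ in range(epochs):
        snapshot = x.copy()
        # Anchor gradient at the snapshot: the full gradient, or a
        # large-batch estimate when anchor_batch is given.
        if anchor_batch is None:
            rows = np.arange(n)
        else:
            rows = rng.choice(n, size=min(anchor_batch, n), replace=False)
        anchor_grad = A[rows].T @ (A[rows] @ snapshot - y[rows]) / len(rows)
        for _ in range(inner):
            idx = rng.choice(n, size=batch, replace=False)
            Ab, yb = A[idx], y[idx]
            grad_x = Ab.T @ (Ab @ x - yb) / batch
            grad_snap = Ab.T @ (Ab @ snapshot - yb) / batch
            # Variance-reduced estimate: minibatch difference plus anchor.
            v = grad_x - grad_snap + anchor_grad
            # Proximal step handles the nonsmooth l1 term.
            x = soft_threshold(x - eta * v, eta * lam)
    return x
```

A call such as `prox_vr_sgd(A, y)` with a data matrix `A` of shape `(n, d)` and targets `y` of length `n` returns the final iterate. The least-squares loss is convex and was chosen only to keep the sketch short; the paper's setting is nonconvex.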


