Quickly Finding a Benign Region via Heavy Ball Momentum in Non-Convex Optimization - 专知 (Zhuanzhi) Papers

Optimizer · Momentum · SimPLe · Identifiable · Continuity

February 14, 2021

Quickly Finding a Benign Region via Heavy Ball Momentum in Non-Convex Optimization

Jun-Kun Wang, Jacob Abernethy

The Heavy Ball Method, proposed by Polyak over five decades ago, is a first-order method for optimizing continuous functions. While its stochastic counterpart has proven extremely popular in training deep networks, there are almost no known functions where deterministic Heavy Ball is provably faster than the simple and classical gradient descent algorithm in non-convex optimization. The success of Heavy Ball has thus far eluded theoretical understanding. Our goal is to address this gap, and in the present work we identify two non-convex problems where we provably show that the Heavy Ball momentum helps the iterate to enter a benign region that contains a global optimal point faster. We show that Heavy Ball exhibits simple dynamics that clearly reveal the benefit of using a larger value of the momentum parameter for these problems. The first of these optimization problems is the phase retrieval problem, which has useful applications in physical science. The second is cubic-regularized minimization, a critical subroutine required by the Nesterov-Polyak cubic-regularized method to find second-order stationary points in general smooth non-convex problems.
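To make the abstract concrete, the following is a minimal sketch of the deterministic Heavy Ball update, w_{t+1} = w_t - eta * grad f(w_t) + beta * (w_t - w_{t-1}), which reduces to plain gradient descent when beta = 0. The function names, the step size `eta`, and the momentum value `beta` are illustrative assumptions, not taken from the paper. The phase retrieval objective used here, f(w) = 1/(4n) * sum_i ((x_i . w)^2 - y_i)^2, is a commonly used formulation and is assumed rather than quoted from the abstract.

```python
def heavy_ball(grad, w0, eta, beta, steps):
    """Run Polyak's Heavy Ball for a fixed number of steps.

    Update: w_{t+1} = w_t - eta * grad(w_t) + beta * (w_t - w_{t-1}),
    with w_{-1} = w_0. Setting beta = 0 gives plain gradient descent.
    """
    w_prev = list(w0)
    w = list(w0)
    for _ in range(steps):
        g = grad(w)
        w_next = [wi - eta * gi + beta * (wi - pi)
                  for wi, gi, pi in zip(w, g, w_prev)]
        w_prev, w = w, w_next
    return w


def phase_retrieval_grad(X, y):
    """Return the gradient function of the (assumed) phase retrieval loss
    f(w) = 1/(4n) * sum_i ((x_i . w)^2 - y_i)^2.

    X is a list of measurement vectors and y the squared measurements.
    """
    n = len(X)

    def grad(w):
        g = [0.0] * len(w)
        for x, yi in zip(X, y):
            p = sum(a * b for a, b in zip(x, w))  # inner product x_i . w
            c = (p * p - yi) * p / n              # chain-rule coefficient
            for j, xj in enumerate(x):
                g[j] += c * xj
        return g

    return grad
```

One way to use this sketch is to draw Gaussian measurement vectors, set y_i = (x_i . w*)^2 for a hidden w*, and compare `heavy_ball(grad, w0, eta, 0.0, steps)` against a run with a larger `beta` from the same starting point. Which setting approaches w* (or -w*) faster depends on the chosen step size and initialization, so the sketch makes no claim about specific numbers.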
