Newton methods have fallen out of favor for modern optimization problems (e.g., deep learning) because of concerns about per-iteration computational complexity; in this setting, highly subsampled first order methods are preferred. In this work we motivate the extension of Newton methods to the highly stochastic regime and argue for the use of the scalable low rank saddle free Newton (LRSFN) method. In this regime, iterative updates are dominated by stochastic noise, and the stability of the method is key. Through stability analysis, we demonstrate that stochastic errors in Newton methods can be greatly amplified by ill-conditioned matrix operators. The LRSFN algorithm mitigates this issue through Levenberg-Marquardt damping, but in general second order methods with stochastic Hessian and gradient information may need to take small steps, unlike in deterministic problems. Numerical results show that even under restrictive step-length conditions, LRSFN can outperform popular first order methods on nontrivial deep learning tasks in terms of generalizability for equivalent computational work.
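To make the update concrete, the following is a minimal sketch (not the authors' implementation) of one damped low-rank saddle-free Newton step: a randomized low-rank eigendecomposition of the Hessian is formed from Hessian-vector products, eigenvalues are replaced by their absolute values (the saddle-free modification), Levenberg-Marquardt damping is added, and the resulting system is solved via the Woodbury identity. The function name `lrsfn_step`, the Hessian-vector-product oracle `H_matvec`, and the default `rank` and `damping` values are illustrative assumptions only.

```python
import numpy as np

def lrsfn_step(H_matvec, grad, dim, rank=20, damping=1e-3, rng=None):
    """Return a damped low-rank saddle-free Newton update direction (sketch)."""
    rng = np.random.default_rng() if rng is None else rng

    # Randomized range finder: probe the (stochastic) Hessian with Gaussian vectors.
    omega = rng.standard_normal((dim, rank))
    Y = np.column_stack([H_matvec(omega[:, j]) for j in range(rank)])
    Q, _ = np.linalg.qr(Y)

    # Project the Hessian onto the captured subspace and eigendecompose the small matrix.
    T = np.column_stack([H_matvec(Q[:, j]) for j in range(rank)])
    B = Q.T @ T
    lam, V = np.linalg.eigh(0.5 * (B + B.T))
    U = Q @ V                        # approximate Hessian eigenvectors

    # Saddle-free modification: use absolute eigenvalues, then apply LM damping.
    abs_lam = np.abs(lam)

    # Woodbury-style solve of (U |Lambda| U^T + damping * I) p = -grad.
    coeff = (U.T @ grad) * (abs_lam / (abs_lam + damping))
    return -(grad - U @ coeff) / damping
```

In this sketch the damping parameter plays the stabilizing role discussed above: it bounds the effective condition number of the (approximate) Hessian operator, so that stochastic errors in the gradient and Hessian-vector products are not amplified by near-singular directions.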