Efficiently Escaping Saddle Points in Bilevel Optimization - 专知 (Zhuanzhi) Papers


Saddle point · Optimizer · Local minimum · Optimization · Minimum point

February 8, 2022

Efficiently Escaping Saddle Points in Bilevel Optimization


Minhui Huang, Kaiyi Ji, Shiqian Ma, Lifeng Lai

Bilevel optimization is one of the fundamental problems in machine learning and optimization. Recent theoretical developments in bilevel optimization focus on finding first-order stationary points in the nonconvex-strongly-convex case. In this paper, we analyze algorithms that can escape saddle points in nonconvex-strongly-convex bilevel optimization. Specifically, we show that perturbed approximate implicit differentiation (AID) with a warm start strategy finds an $\epsilon$-approximate local minimum of bilevel optimization in $\tilde{O}(\epsilon^{-2})$ iterations with high probability. Moreover, we propose an inexact NEgative-curvature-Originated-from-Noise Algorithm (iNEON), a pure first-order algorithm that can escape saddle points and find local minima of stochastic bilevel optimization. As a by-product, we provide the first nonasymptotic analysis of the perturbed multi-step gradient descent ascent (GDmax) algorithm, which converges to a local minimax point for minimax problems.
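To make the idea of perturbing an implicit-differentiation update concrete, here is a minimal Python sketch on a toy problem. It is not the paper's algorithm or its parameter choices. The objectives, step sizes, threshold, perturbation radius, and the function names `inner_steps`, `aid_hypergradient`, and `perturbed_aid` are all illustrative assumptions. The lower level is $g(x, y) = \tfrac{1}{2}(y - x_1)^2$, so $y^*(x) = x_1$, and the upper level is $f(x, y) = \tfrac{1}{4}y^4 - \tfrac{1}{2}y^2 + \tfrac{1}{2}x_2^2$. The resulting objective $\Phi(x) = f(x, y^*(x))$ has a saddle point at $x = (0, 0)$ and local minima at $x = (\pm 1, 0)$.

```python
import math
import random


def inner_steps(x, y, n_steps=5, beta=0.5):
    """Warm-started gradient steps on the toy lower level g(x, y) = 0.5*(y - x[0])**2."""
    for _ in range(n_steps):
        y -= beta * (y - x[0])  # grad_y g = y - x[0]
    return y


def aid_hypergradient(x, y):
    """Implicit-differentiation estimate of grad Phi(x) at an approximate y.

    Uses grad Phi = grad_x f - hess_xy g * (hess_yy g)^{-1} * grad_y f,
    specialised to the toy objectives (all derivatives written by hand).
    """
    grad_x_f = (0.0, x[1])
    grad_y_f = y ** 3 - y
    hess_yy_g = 1.0
    hess_xy_g = (-1.0, 0.0)
    v = grad_y_f / hess_yy_g  # the linear solve is a scalar division here
    return (grad_x_f[0] - hess_xy_g[0] * v, grad_x_f[1] - hess_xy_g[1] * v)


def perturbed_aid(x0, radius, steps=400, alpha=0.1,
                  g_thresh=1e-3, interval=50, seed=0):
    """Gradient descent on the hypergradient with occasional small perturbations.

    A perturbation of length `radius` in a random direction is added when the
    hypergradient is small and none was added in the last `interval` steps.
    """
    rng = random.Random(seed)
    x = list(x0)
    y = 0.0            # lower-level iterate, reused across outer steps (warm start)
    last = -interval   # iteration index of the most recent perturbation
    for t in range(steps):
        y = inner_steps(x, y)
        h = aid_hypergradient(x, y)
        if radius > 0 and math.hypot(*h) <= g_thresh and t - last >= interval:
            theta = rng.uniform(0.0, 2.0 * math.pi)
            x[0] += radius * math.cos(theta)
            x[1] += radius * math.sin(theta)
            last = t
            y = inner_steps(x, y)
            h = aid_hypergradient(x, y)
        x[0] -= alpha * h[0]
        x[1] -= alpha * h[1]
    return tuple(x)


stuck = perturbed_aid((0.0, 0.0), radius=0.0)     # no perturbation
escaped = perturbed_aid((0.0, 0.0), radius=0.01)  # with perturbation
print("without perturbation:", stuck)
print("with perturbation:   ", escaped)
```

Started exactly at the saddle point, the unperturbed run never moves because the hypergradient is zero there. The perturbed run picks up a small component along the negative-curvature direction and drifts to a neighbourhood of one of the two local minima. In this toy version the perturbations keep recurring near the minimum, so the final iterate is only accurate to roughly the perturbation radius.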



Related content

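In mathematics, a saddle point or minimax point is a point on the surface of the graph of a function where the slopes (derivatives) in orthogonal directions are all zero, but which is not a local extremum of the function. A saddle point is a critical point that has a relative minimum along one axial direction (between the peaks) and a relative maximum along the crossing axis.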
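The definition above can be checked numerically with the second-derivative test. The sketch below is illustrative only: the helper name `classify_critical_point` and the example function $f(x, y) = x^2 - y^2$ are assumptions chosen for the demonstration. It classifies a critical point of a two-variable function from the eigenvalues of its symmetric $2 \times 2$ Hessian.

```python
import math


def classify_critical_point(hxx, hxy, hyy, tol=1e-12):
    """Classify a critical point from the Hessian [[hxx, hxy], [hxy, hyy]].

    The eigenvalues of a symmetric 2x2 matrix are
    mean +/- sqrt(((hxx - hyy) / 2)**2 + hxy**2).
    """
    mean = 0.5 * (hxx + hyy)
    spread = math.sqrt((0.5 * (hxx - hyy)) ** 2 + hxy ** 2)
    lam_min, lam_max = mean - spread, mean + spread
    if lam_min > tol:
        return "local minimum"
    if lam_max < -tol:
        return "local maximum"
    if lam_min < -tol and lam_max > tol:
        return "saddle point"
    return "degenerate"  # a zero eigenvalue: the test is inconclusive


# f(x, y) = x**2 - y**2 has gradient (2x, -2y), which vanishes at the origin,
# and constant Hessian [[2, 0], [0, -2]].
print(classify_critical_point(2.0, 0.0, -2.0))
```

For $f(x, y) = x^2 - y^2$ the origin is a minimum along the $x$-axis and a maximum along the $y$-axis, so the Hessian has one positive and one negative eigenvalue and the point is classified as a saddle.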

