In this paper, we study smooth stochastic multi-level composition optimization problems, where the objective function is a nested composition of $T$ functions. We assume access to noisy evaluations of the functions and their gradients through a stochastic first-order oracle. For solving this class of problems, we propose two algorithms using moving-average stochastic estimates, and analyze their convergence to an $\epsilon$-stationary point of the problem. We show that the first algorithm, which is a generalization of \cite{GhaRuswan20} to the $T$-level case, achieves a sample complexity of $\mathcal{O}(1/\epsilon^6)$ by using mini-batches of samples in each iteration. By modifying this algorithm with linearized stochastic estimates of the function values, we improve the sample complexity to $\mathcal{O}(1/\epsilon^4)$. {\color{black}This modification not only removes the requirement of a mini-batch of samples in each iteration, but also makes the algorithm parameter-free and easy to implement}. To the best of our knowledge, this is the first time that such an online algorithm, designed for the (un)constrained multi-level setting, obtains the same sample complexity as the smooth single-level setting under standard assumptions (unbiasedness and boundedness of the second moments) on the stochastic first-order oracle.
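For concreteness, the multi-level problem class described above can be written as follows (a sketch of the standard formulation; the notation $f_i$ and $X$ is assumed here, not quoted from the paper):
\[
\min_{x \in X} \; f(x) \;:=\; f_1\bigl(f_2(\cdots f_T(x)\cdots)\bigr),
\]
where each $f_i$ is smooth, and only noisy evaluations of the $f_i$ and their gradients $\nabla f_i$ are available through the stochastic first-order oracle. The single-level setting corresponds to $T = 1$, in which case the $\mathcal{O}(1/\epsilon^4)$ sample complexity is the standard benchmark that the second algorithm matches.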