Tighter Analysis of Alternating Stochastic Gradient Method for Stochastic Nested Problems - 专知论文

SimPLe · Optimization · Sample complexity · Stochastic gradient descent · CASES

June 25, 2021

Tighter Analysis of Alternating Stochastic Gradient Method for Stochastic Nested Problems

Tianyi Chen, Yuejiao Sun, Wotao Yin

From arXiv, submitted for publication

Stochastic nested optimization, including stochastic compositional, min-max and bilevel optimization, is gaining popularity in many machine learning applications. While the three problems share the nested structure, existing works often treat them separately and thus develop problem-specific algorithms and analyses. Among various exciting developments, simple SGD-type updates (potentially on multiple variables) are still prevalent in solving this class of nested problems, but they are believed to have a slower convergence rate than their counterparts for non-nested problems. This paper unifies several SGD-type updates for stochastic nested problems into a single SGD approach that we term the ALternating Stochastic gradient dEscenT (ALSET) method. By leveraging the hidden smoothness of the problem, this paper presents a tighter analysis of ALSET for stochastic nested problems. Under the new analysis, ALSET requires ${\cal O}(\epsilon^{-2})$ samples to achieve an $\epsilon$-stationary point of the nested problem. Under certain regularity conditions, applying our results to stochastic compositional, min-max and reinforcement learning problems either improves or matches the best-known sample complexity in the respective cases. Our results explain why simple SGD-type algorithms for stochastic nested problems all work very well in practice without the need for further modifications.
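
The alternating update pattern the abstract describes can be illustrated with a small sketch. This is not the authors' ALSET implementation: the toy saddle objective, the constant step sizes `alpha` and `beta`, the Gaussian noise model, and the function name `alternating_sgd` are all assumptions chosen for illustration. The sketch only shows the idea of taking one stochastic gradient step on the inner variable and then one on the outer variable, each using the latest value of the other.

```python
import random


def alternating_sgd(num_iters=2000, alpha=0.05, beta=0.5, noise_std=0.1, seed=0):
    """Alternating stochastic gradient steps on a toy min-max problem.

    Assumed objective (not from the paper):
        min_x max_y f(x, y) = (x - 1) * y - 0.5 * y**2
    For fixed x the inner maximizer is y*(x) = x - 1, so the outer
    objective is 0.5 * (x - 1)**2 and the solution is x = 1, y = 0.
    """
    rng = random.Random(seed)
    x, y = 5.0, 0.0  # arbitrary starting point
    for _ in range(num_iters):
        # Inner (max) step: noisy gradient of f in y at the current x.
        grad_y = (x - 1.0) - y + rng.gauss(0.0, noise_std)
        y += beta * grad_y
        # Outer (min) step: noisy gradient of f in x at the updated y.
        grad_x = y + rng.gauss(0.0, noise_std)
        x -= alpha * grad_x
    return x, y


x_final, y_final = alternating_sgd()
print(f"x = {x_final:.3f}, y = {y_final:.3f}")
```

With constant step sizes the iterates settle into a small noise-dominated neighborhood of the solution rather than converging exactly; the larger inner step size here is an assumed choice that lets `y` track its best response as `x` moves.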
