Stochastic Polyak Stepsize with a Moving Target - 专知论文


Data points · Variance reduction · Reducible · Variance · SGD

June 22, 2021

Stochastic Polyak Stepsize with a Moving Target


Robert M. Gower, Aaron Defazio, Michael Rabbat

From arXiv, 41 pages, 13 figures, 1 table

We propose a new stochastic gradient method that uses recorded past loss values to reduce the variance. Our method can be interpreted as a new stochastic variant of the Polyak Stepsize that converges globally without assuming interpolation. Our method introduces auxiliary variables, one for each data point, that track the loss value for each data point. We provide a global convergence theory for our method by showing that it can be interpreted as a special variant of online SGD. The new method only stores a single scalar per data point, opening up new applications for variance reduction where memory is the bottleneck.
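For orientation, here is a minimal sketch of the classical stochastic Polyak stepsize that the abstract takes as its starting point. It is not the authors' algorithm. It assumes interpolation (a consistent least-squares problem in which every per-sample loss can reach zero), so each per-sample target is held fixed at 0. The abstract's method instead updates such per-sample scalars as training proceeds, and that update rule is not reproduced here. The names `polyak_sgd` and `targets`, and the toy problem itself, are illustrative assumptions.

```python
import numpy as np


def polyak_sgd(A, b, targets, epochs=200, seed=0):
    """Classical stochastic Polyak steps on f_i(w) = 0.5 * (a_i . w - b_i)^2.

    `targets` holds one scalar per data point. In this sketch the values are
    fixed (assumed known optimal per-sample losses), not learned.
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    w = np.zeros(d)
    for _ in range(epochs * n):
        i = rng.integers(n)                # sample one data point uniformly
        residual = A[i] @ w - b[i]
        loss_i = 0.5 * residual ** 2       # current loss on data point i
        grad_i = residual * A[i]           # gradient of that loss
        grad_sq = grad_i @ grad_i
        gap = loss_i - targets[i]          # distance of the loss from its target
        if gap <= 0.0 or grad_sq == 0.0:
            continue                       # already at the target: no step
        w -= (gap / grad_sq) * grad_i      # Polyak stepsize = gap / ||grad||^2
    return w


# Toy consistent system: interpolation holds, so all targets are zero.
rng = np.random.default_rng(1)
A = rng.normal(size=(20, 5))
w_true = rng.normal(size=5)
b = A @ w_true
targets = np.zeros(20)                     # a single scalar per data point

w_hat = polyak_sgd(A, b, targets)
error = float(np.linalg.norm(w_hat - w_true))
```

The only per-sample state in this sketch is the `targets` array of `n` scalars, whereas gradient-table variance-reduction methods such as SAGA keep one stored gradient per data point. That difference is the memory argument the abstract makes.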

