We study the application of variance reduction (VR) techniques to general non-convex stochastic optimization problems. In this setting, the recent work STORM [Cutkosky-Orabona '19] overcomes the drawback of having to compute gradients of "mega-batches" that earlier VR methods rely on. There, STORM utilizes recursive momentum to achieve the VR effect and was later made fully adaptive in STORM+ [Levy et al., '21], where full adaptivity removes the need to know certain problem-specific parameters, such as the smoothness of the objective and bounds on the variance and norm of the stochastic gradients, in order to set the step size. However, STORM+ crucially relies on the assumption that the function values are bounded, excluding a large class of useful functions. In this work, we propose META-STORM, a generalized framework extending STORM+ that removes this bounded function values assumption while still attaining the optimal convergence rate for non-convex optimization. META-STORM not only maintains full adaptivity, removing the need to obtain problem-specific parameters, but also improves the convergence rate's dependency on the problem parameters. Furthermore, META-STORM admits a wide range of parameter settings that subsumes previous methods, allowing for greater flexibility across settings. Finally, we demonstrate the effectiveness of META-STORM through experiments on common deep learning tasks. Our algorithm improves upon the previous work STORM+ and is competitive with widely used algorithms after the addition of per-coordinate updates and exponential moving average heuristics.
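For concreteness, the recursive momentum mechanism referenced above is, in the standard STORM recursion of [Cutkosky-Orabona '19], a gradient estimate $d_t$ updated as
\[
d_t = \nabla f(x_t; \xi_t) + (1 - a_t)\bigl(d_{t-1} - \nabla f(x_{t-1}; \xi_t)\bigr), \qquad x_{t+1} = x_t - \eta_t\, d_t,
\]
where $\xi_t$ is the fresh stochastic sample at step $t$ and $a_t \in [0,1]$ is the momentum parameter (the notation here is the usual one for STORM, not fixed by this abstract): $a_t = 1$ recovers plain SGD, while $a_t < 1$ adds the variance-reducing correction term $d_{t-1} - \nabla f(x_{t-1}; \xi_t)$ without requiring any mega-batch. The specific adaptive choices of $a_t$ and $\eta_t$ that define META-STORM are given in the body of the paper.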