The stochastic gradient descent method and its variants constitute the core optimization algorithms that achieve good convergence rates for solving machine learning problems. These rates are obtained especially when the algorithms are fine-tuned for the application at hand. Although this tuning process can require large computational costs, recent work has shown that these costs can be reduced by line search methods that iteratively adjust the stepsize. We propose an alternative approach to stochastic line search: a new algorithm, SMB, based on forward-step model building. This model building step incorporates second-order information that allows adjusting not only the stepsize but also the search direction. Noting that deep learning model parameters come in groups (layers of tensors), our method builds its model and calculates a new step for each parameter group. This novel diagonalization approach makes the selected step lengths adaptive. We provide a convergence rate analysis, and experimentally show that the proposed algorithm achieves faster convergence and better generalization on well-known test problems. More precisely, SMB requires less tuning and shows performance comparable to other adaptive methods.
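To make the per-parameter-group idea concrete, the sketch below shows one iteration in which each group first takes a forward (trial) SGD step and, when the trial point does not give sufficient decrease, the step is corrected by minimizing a one-dimensional quadratic model built from the two loss values and the directional derivative. This is a simplified stand-in for the model-building step described above (the paper's model also corrects the search direction); the loss, constants, and helper names here are illustrative assumptions, not the paper's exact formulas.

```python
# Minimal NumPy sketch: per-group trial step + quadratic-model correction.
# Hypothetical names and constants; illustrative only.
import numpy as np

def smb_like_step(groups, loss_and_grads, lr=0.5, c=1e-4):
    """One iteration over a list of parameter groups (NumPy arrays)."""
    f0, grads = loss_and_grads(groups)
    new_groups = []
    for w, g in zip(groups, grads):
        s = -lr * g                                   # forward (trial) step for this group
        trial = [p if p is not w else w + s for p in groups]
        f_trial, _ = loss_and_grads(trial)
        gTs = float(np.dot(g.ravel(), s.ravel()))     # directional derivative along s
        if f_trial <= f0 + c * gTs:                   # sufficient decrease: keep trial step
            new_groups.append(w + s)
        else:
            # Quadratic model m(t) with m(0)=f0, m'(0)=gTs, m(1)=f_trial;
            # step to its minimizer t* (fall back to t=1 if the model is not convex).
            denom = 2.0 * (f_trial - f0 - gTs)
            t = -gTs / denom if denom > 0 else 1.0
            new_groups.append(w + t * s)
    return new_groups

# Toy usage: two parameter groups of a separable quadratic loss.
def loss_and_grads(groups):
    w1, w2 = groups
    f = 0.5 * np.sum(w1**2) + 2.0 * np.sum(w2**2)
    return f, [w1, 4.0 * w2]

params = [np.array([1.0, -2.0]), np.array([0.5])]
for _ in range(5):
    params = smb_like_step(params, loss_and_grads)
```

In this toy example the second group's trial step overshoots, and the quadratic correction rescales it so that the group lands on its minimizer in a single corrected step, which is the kind of adaptivity the per-group model building is meant to provide.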