Convergence Properties of Stochastic Hypergradients - Zhuanzhi Papers


Approximation · Empirical risk minimization · Optimizer · Empirical risk · Mean squared error

April 12, 2021

Convergence Properties of Stochastic Hypergradients


Riccardo Grazzi, Massimiliano Pontil, Saverio Salzo

from arxiv, added experiments, a table of notation and some comments. 22 pages

Bilevel optimization problems are receiving increasing attention in machine learning as they provide a natural framework for hyperparameter optimization and meta-learning. A key step to tackle these problems is the efficient computation of the gradient of the upper-level objective (hypergradient). In this work, we study stochastic approximation schemes for the hypergradient, which are important when the lower-level problem is empirical risk minimization on a large dataset. The method that we propose is a stochastic variant of the approximate implicit differentiation approach in (Pedregosa, 2016). We provide bounds for the mean square error of the hypergradient approximation, under the assumption that the lower-level problem is accessible only through a stochastic mapping which is a contraction in expectation. In particular, our main bound is agnostic to the choice of the two stochastic solvers employed by the procedure. We provide numerical experiments to support our theoretical analysis and to show the advantage of using stochastic hypergradients in practice.
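The procedure described in the abstract can be illustrated on a toy problem. The sketch below is not the authors' algorithm or code; it is a minimal illustration under stated assumptions. The lower-level problem is ridge regression with regularization weight lam, the fixed-point map Phi(w, lam) is one gradient-descent step with step size eta, and the upper-level objective E is the mean squared loss on a validation set. Under these assumptions the hypergradient is grad_lam E + (d_lam Phi)^T v, where v solves (I - (d_w Phi)^T) v = grad_w E. Both the fixed point and the linear system are approximated with minibatch iterations. All names (`stochastic_hypergradient`, `eta`, `decay`, `t`, `k`) and the step-size schedule are hypothetical choices made for illustration.

```python
import random


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def loss_grad(batch, w):
    """Gradient in w of the mean of 0.5 * (x.w - y)^2 over the batch."""
    out = [0.0] * len(w)
    for x, y in batch:
        c = (dot(x, w) - y) / len(batch)
        out = [oj + c * xj for oj, xj in zip(out, x)]
    return out


def hess_vec(batch, v, lam):
    """(H_B + lam * I) v, where H_B is the minibatch Hessian of the squared loss."""
    out = [lam * vj for vj in v]
    for x, _ in batch:
        c = dot(x, v) / len(batch)
        out = [oj + c * xj for oj, xj in zip(out, x)]
    return out


def stochastic_hypergradient(train, val, lam, eta, t, k, batch_size, decay, rng):
    """Estimate d/d(lam) of the validation loss at the ridge solution w(lam)."""
    d = len(train[0][0])

    # Solver 1: t stochastic iterations toward the fixed point w = Phi(w, lam),
    # using Phi_hat(w, lam) - w = -eta * (minibatch loss gradient + lam * w).
    w = [0.0] * d
    for i in range(t):
        batch = rng.sample(train, batch_size)
        step = 1.0 / (1.0 + decay * i)  # assumed decreasing step-size schedule
        g = loss_grad(batch, w)
        w = [wj - step * eta * (gj + lam * wj) for wj, gj in zip(w, g)]

    # Solver 2: k stochastic iterations for (I - d_w Phi^T) v = grad_w E(w),
    # using d_w Phi_hat^T v + grad_w E - v = grad_w E - eta * (H_B + lam * I) v.
    g_val = loss_grad(val, w)
    v = [0.0] * d
    for i in range(k):
        batch = rng.sample(train, batch_size)
        step = 1.0 / (1.0 + decay * i)
        hv = hess_vec(batch, v, lam)
        v = [vj + step * (gj - eta * hj) for vj, gj, hj in zip(v, g_val, hv)]

    # Combine: grad_lam E = 0 in this toy problem and d_lam Phi = -eta * w.
    return -eta * dot(w, v)


# Synthetic data (an assumption for illustration): y = x . w_true + noise.
rng = random.Random(0)
w_true = [1.0, -2.0, 0.5]


def make_data(n):
    data = []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in w_true]
        data.append((x, dot(x, w_true) + rng.gauss(0.0, 0.5)))
    return data


train, val = make_data(200), make_data(100)
estimate = stochastic_hypergradient(train, val, lam=0.1, eta=0.3, t=300, k=300,
                                    batch_size=10, decay=0.05, rng=rng)
print("stochastic hypergradient estimate:", estimate)
```

Setting `batch_size=len(train)` and `decay=0.0` turns both loops into deterministic fixed-point iterations, which is a convenient way to check the estimate against the closed-form ridge hypergradient on small problems.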

