Hyperparameter optimization with approximate gradient - Zhuanzhi (专知) papers


Hyperparameters · Optimizers · Regularization terms · MoDELS · Approximation

November 21, 2022

Hyperparameter optimization with approximate gradient


Fabian Pedregosa

From arXiv; fixes an error in the proof of Theorem 2.

Most models in machine learning contain at least one hyperparameter to control for model complexity. Choosing an appropriate set of hyperparameters is both crucial in terms of model accuracy and computationally challenging. In this work we propose an algorithm for the optimization of continuous hyperparameters using inexact gradient information. An advantage of this method is that hyperparameters can be updated before model parameters have fully converged. We also give sufficient conditions for the global convergence of this method, based on regularity conditions of the involved functions and summability of errors. Finally, we validate the empirical performance of this method on the estimation of regularization constants of L2-regularized logistic regression and kernel ridge regression. Empirical benchmarks indicate that our approach is highly competitive with respect to state-of-the-art methods.
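The abstract does not fix a notation, so the following is a sketch under assumed symbols: λ for the hyperparameters, x for the model parameters, h for the inner (training) objective, and g for the outer (validation) loss. With those names, hyperparameter selection can be written as a bilevel problem. If h is twice differentiable with an invertible Hessian in x, the implicit function theorem gives the gradient of the outer objective:

```latex
\min_{\lambda} \; f(\lambda) = g\bigl(x(\lambda), \lambda\bigr)
\quad \text{s.t.} \quad
x(\lambda) \in \arg\min_{x} \; h(x, \lambda)

\nabla f(\lambda)
  = \nabla_2 g
  - \bigl(\nabla^2_{1,2} h\bigr)^{\top}
    \bigl(\nabla^2_{1} h\bigr)^{-1}
    \nabla_1 g
```

Here the subscripts 1 and 2 denote derivatives with respect to the first argument (x) and the second argument (λ), and all terms are evaluated at (x(λ), λ). An inexact variant evaluates the same expression at an approximate inner solution x_k with ‖x_k − x(λ_k)‖ ≤ ε_k. One natural reading of "summability of errors" is that the tolerances satisfy Σ_k ε_k < ∞, but that reading is an assumption here, not a statement taken from the abstract.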

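The idea of updating hyperparameters before the model parameters have fully converged can be illustrated with a small sketch. This is not the author's implementation. It swaps in plain ridge regression (rather than the L2-regularized logistic regression or kernel ridge regression named in the abstract) because the Hessian is available in closed form. The function names `inner_solve`, `approx_hypergradient` and `optimize_lambda`, the parametrization of the penalty as exp(lam), the step size, and the geometric tolerance schedule are all assumptions made for illustration. The linear system is solved directly for brevity; at scale it would more plausibly be solved inexactly as well.

```python
import numpy as np


def inner_solve(X, y, lam, w0, tol, max_iter=10_000):
    """Approximately minimize h(w, lam) = 0.5*||Xw - y||^2 + 0.5*exp(lam)*||w||^2.

    Plain gradient descent, warm-started at w0 and stopped as soon as the
    gradient norm is at most tol, so the returned w is deliberately inexact.
    """
    alpha = np.exp(lam)
    step = 1.0 / (np.linalg.norm(X, 2) ** 2 + alpha)  # 1 / Lipschitz constant
    w = w0.copy()
    for _ in range(max_iter):
        grad = X.T @ (X @ w - y) + alpha * w
        if np.linalg.norm(grad) <= tol:
            break
        w -= step * grad
    return w


def approx_hypergradient(X_tr, X_val, y_val, lam, w):
    """Approximate d/d(lam) of the validation loss 0.5*||X_val w - y_val||^2.

    Uses the implicit-function formula evaluated at the (inexact) w.
    """
    alpha = np.exp(lam)
    grad_g = X_val.T @ (X_val @ w - y_val)           # outer gradient in w
    hess_h = X_tr.T @ X_tr + alpha * np.eye(w.size)  # inner Hessian in w
    q = np.linalg.solve(hess_h, grad_g)              # direct solve (small problem)
    cross = alpha * w                                # d/d(lam) of the inner gradient
    return -cross @ q


def optimize_lambda(X_tr, y_tr, X_val, y_val, lam0=0.0, n_outer=50,
                    outer_step=0.1, tol0=1e-2, decay=0.9):
    """Gradient descent on lam with a geometrically shrinking inner tolerance."""
    lam, w = lam0, np.zeros(X_tr.shape[1])
    history = []
    for k in range(n_outer):
        tol = tol0 * decay ** k  # geometric, hence summable, tolerances
        w = inner_solve(X_tr, y_tr, lam, w, tol)
        history.append(0.5 * np.sum((X_val @ w - y_val) ** 2))
        lam -= outer_step * approx_hypergradient(X_tr, X_val, y_val, lam, w)
    return lam, w, history


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X_tr, X_val = rng.standard_normal((40, 5)), rng.standard_normal((40, 5))
    w_true = rng.standard_normal(5)
    y_tr = X_tr @ w_true + 0.5 * rng.standard_normal(40)
    y_val = X_val @ w_true + 0.5 * rng.standard_normal(40)
    lam, w, history = optimize_lambda(X_tr, y_tr, X_val, y_val)
    print(f"lambda = {lam:.3f}, validation loss = {history[-1]:.4f}")
```

Early outer iterations use a loose inner tolerance and are therefore cheap. The warm start means each later, tighter solve begins close to its solution. The outer step size is fixed by hand in this sketch and would need tuning or a line search on real data.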


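Related content

Hyperparameter

In Bayesian statistics, a hyperparameter is a parameter of a prior distribution; the term is used to distinguish such parameters from the parameters of the model for the underlying system under analysis.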
