Gradient-based Bi-level Optimization for Deep Learning: A Survey

February 7, 2023


Can Chen, Xi Chen, Chen Ma, Zixuan Liu, Xue Liu

Source: arXiv. Keywords: AI4Science; Bi-level Optimization; Hyperparameter Optimization; Meta Learning; Implicit Function

Bi-level optimization, especially the gradient-based category, has been widely used in the deep learning community including hyperparameter optimization and meta knowledge extraction. Bi-level optimization embeds one problem within another and the gradient-based category solves the outer level task by computing the hypergradient, which is much more efficient than classical methods such as the evolutionary algorithm. In this survey, we first give a formal definition of the gradient-based bi-level optimization. Secondly, we illustrate how to formulate a research problem as a bi-level optimization problem, which is of great practical use for beginners. More specifically, there are two formulations: the single-task formulation to optimize hyperparameters such as regularization parameters and the distilled data, and the multi-task formulation to extract meta knowledge such as the model initialization. With a bi-level formulation, we then discuss four bi-level optimization solvers to update the outer variable including explicit gradient update, proxy update, implicit function update, and closed-form update. Last but not least, we conclude the survey by pointing out the great potential of gradient-based bi-level optimization on science problems (AI4Science).
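To make the idea of a hypergradient concrete, the following is a minimal sketch on a toy scalar problem. The problem itself, the function names, and the constants `a`, `b`, the step sizes, and the iteration counts are all assumptions chosen for illustration; none of them come from the survey. The inner problem fits a weight `w` under an L2 penalty with strength `lam`, and the outer problem tunes `lam` so that the fitted weight matches a validation target `b`. The sketch computes the hypergradient in two of the four ways the abstract names: an explicit gradient update (differentiating through the unrolled inner gradient steps) and an implicit function update (differentiating the inner optimality condition).

```python
# Toy bi-level problem (illustrative assumption, not taken from the survey):
#   inner:  w*(lam) = argmin_w 0.5*(w - a)**2 + 0.5*lam*w**2
#   outer:  min_lam  0.5*(w*(lam) - b)**2
# For this inner objective, w*(lam) = a / (1 + lam) in closed form.


def inner_grad(w, lam, a):
    """Gradient of the inner objective with respect to w."""
    return (w - a) + lam * w


def unrolled_hypergradient(lam, a, b, steps=200, lr=0.1):
    """Explicit gradient update: differentiate through the inner GD steps.

    Tracks dw/dlam alongside w (forward-mode differentiation of the update
    rule w <- w - lr * inner_grad(w, lam, a)).
    """
    w, dw_dlam = 0.0, 0.0
    for _ in range(steps):
        # Derivative of the update rule w.r.t. lam, using the current w.
        dw_dlam = dw_dlam - lr * ((1.0 + lam) * dw_dlam + w)
        w = w - lr * inner_grad(w, lam, a)
    # Chain rule: d(outer)/d(lam) = (w - b) * dw/dlam.
    return (w - b) * dw_dlam, w


def implicit_hypergradient(lam, a, b):
    """Implicit function update: differentiate the optimality condition.

    At the inner optimum inner_grad(w*, lam) = 0, so
    dw*/dlam = -(d2f/dw2)^-1 * d2f/(dw dlam) = -w* / (1 + lam).
    """
    w_star = a / (1.0 + lam)
    dw_dlam = -w_star / (1.0 + lam)
    return (w_star - b) * dw_dlam


def optimize_lambda(a=2.0, b=1.0, lam=0.0, outer_steps=300, outer_lr=0.2):
    """Outer loop: gradient descent on lam using the unrolled hypergradient."""
    for _ in range(outer_steps):
        g, _ = unrolled_hypergradient(lam, a, b)
        lam = lam - outer_lr * g
    return lam


if __name__ == "__main__":
    g_unrolled, _ = unrolled_hypergradient(0.5, 2.0, 1.0)
    g_implicit = implicit_hypergradient(0.5, 2.0, 1.0)
    print("unrolled:", g_unrolled, "implicit:", g_implicit)
    print("tuned lam:", optimize_lambda())
```

On this toy problem the two hypergradients should agree once the inner loop has converged, and the outer loop should drive `lam` toward the value at which `a / (1 + lam)` equals `b`. In practice the unrolled variant stores or re-derives the whole inner trajectory, while the implicit variant needs an (approximate) inverse Hessian of the inner objective, which is the main trade-off between the two.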

