Debiasing a First-order Heuristic for Approximate Bi-level Optimization


Optimizer · Approximation · Second derivative · Stationary point · Interpretability

June 4, 2021


Valerii Likhosherstov, Xingyou Song, Krzysztof Choromanski, Jared Davis, Adrian Weller

from arxiv, Proceedings of the 38th International Conference on Machine Learning, PMLR 139, 2021. arXiv admin note: text overlap with arXiv:2006.03631

Approximate bi-level optimization (ABLO) consists of (outer-level) optimization problems, involving numerical (inner-level) optimization loops. While ABLO has many applications across deep learning, it suffers from time and memory complexity proportional to the length $r$ of its inner optimization loop. To address this complexity, an earlier first-order method (FOM) was proposed as a heuristic that omits second derivative terms, yielding significant speed gains and requiring only constant memory. Despite FOM's popularity, there is a lack of theoretical understanding of its convergence properties. We contribute by theoretically characterizing FOM's gradient bias under mild assumptions. We further demonstrate a rich family of examples where FOM-based SGD does not converge to a stationary point of the ABLO objective. We address this concern by proposing an unbiased FOM (UFOM) enjoying constant memory complexity as a function of $r$. We characterize the introduced time-variance tradeoff, demonstrate convergence bounds, and find an optimal UFOM for a given ABLO problem. Finally, we propose an efficient adaptive UFOM scheme.
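The gap between the exact ABLO gradient and the first-order heuristic can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the scalar quadratic inner and outer losses, the initialization of the inner loop at the outer parameter `theta`, the two hand-picked tasks, and the hypothetical helper names `inner_loop` and `gradients`. The exact gradient multiplies the outer-loss derivative by the Jacobian of the inner loop, which contains the second-derivative terms; the first-order estimate simply drops that factor.

```python
def inner_loop(theta, a, c_in, alpha, r):
    """Run r gradient steps on L_in(phi) = 0.5 * a * (phi - c_in)**2 from phi_0 = theta.

    Returns the final iterate phi_r and the Jacobian d(phi_r)/d(theta).
    """
    phi, jac = theta, 1.0
    for _ in range(r):
        phi = phi - alpha * a * (phi - c_in)  # inner gradient step
        jac *= 1.0 - alpha * a                # second-derivative term (Hessian of L_in is a)
    return phi, jac


def gradients(theta, tasks, alpha, r):
    """Average exact and first-order outer gradients over toy tasks.

    Each task is (a, c_in, c_out); the outer loss is 0.5 * (phi_r - c_out)**2.
    """
    exact, first_order = 0.0, 0.0
    for a, c_in, c_out in tasks:
        phi_r, jac = inner_loop(theta, a, c_in, alpha, r)
        outer_grad = phi_r - c_out     # derivative of the outer loss at phi_r
        exact += jac * outer_grad      # chain rule through the inner loop
        first_order += outer_grad      # heuristic: treat the Jacobian as 1
    n = len(tasks)
    return exact / n, first_order / n


if __name__ == "__main__":
    # Two hypothetical tasks with different inner curvature a.
    tasks = [(1.0, 0.0, 1.0), (0.1, 0.0, -1.0)]
    exact, first_order = gradients(0.0, tasks, alpha=0.5, r=2)
    print("first-order gradient at theta=0:", first_order)
    print("exact gradient at theta=0:", exact)
```

In this toy setting the first-order estimate is zero at `theta = 0` while the exact gradient is roughly 0.326, so an optimizer driven by the first-order estimate would stop at a point that is not stationary for the true objective. The effect relies on the tasks having different curvatures: with a single scalar task the dropped factor only rescales the gradient and both estimates vanish at the same point.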

