利用以人口为主的强盗优化在线超光度计 (Provably Efficient Online Hyperparameter Optimization with Population-Based Bandits) - 专知论文

会员服务 ·

0

PBT · 超参数 · 赌博机/老虎机 · Performer · 优化器 ·

2020 年 10 月 14 日

Provably Efficient Online Hyperparameter Optimization with Population-Based Bandits

翻译：利用以人口为主的强盗优化在线超光度计

Jack Parker-Holder,Vu Nguyen,Stephen Roberts

from arxiv, Camera-ready version, NeurIPS 2020

Many of the recent triumphs in machine learning are dependent on well-tuned hyperparameters. This is particularly prominent in reinforcement learning (RL) where a small change in the configuration can lead to failure. Despite the importance of tuning hyperparameters, it remains expensive and is often done in a naive and laborious way. A recent solution to this problem is Population Based Training (PBT) which updates both weights and hyperparameters in a single training run of a population of agents. PBT has been shown to be particularly effective in RL, leading to widespread use in the field. However, PBT lacks theoretical guarantees since it relies on random heuristics to explore the hyperparameter space. This inefficiency means it typically requires vast computational resources, which is prohibitive for many small and medium sized labs. In this work, we introduce the first provably efficient PBT-style algorithm, Population-Based Bandits (PB2). PB2 uses a probabilistic model to guide the search in an efficient way, making it possible to discover high performing hyperparameter configurations with far fewer agents than typically required by PBT. We show in a series of RL experiments that PB2 is able to achieve high performance with a modest computational budget.

翻译：近来在机器学习方面的许多成功都依赖于经过良好调整的超参数。这在强化学习(RL)中特别突出,因为对配置进行小的改变可能导致失败。尽管调整超参数很重要,但费用仍然昂贵,而且往往以幼稚和艰苦的方式完成。这个问题的最近解决办法是人口培训(PBT),它更新了一个代理人员单一培训的重量和超参数。PBT已证明在RL中特别有效,导致外地的广泛使用。然而,PBT缺乏理论保障,因为它依靠随机的超参数空间探索。这种效率低通常意味着它需要大量的计算资源,而许多中小型实验室则无法使用这种资源。在这项工作中,我们采用了第一个效率很高的PBT型算法(PPB2)。 PB2使用一种概率模型来有效指导搜索,从而能够发现高性超参数配置,而比通常高的PB2级测试要低得多。我们用一个普通的PB2级测试来显示一个普通的PLT。

0

相关内容

PBT

【NeurIPS2020】图网的主邻域聚合

【NeurIPS2020】图网的主邻域聚合

专知会员服务

33+阅读 · 2020年9月27日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

【斯坦福】凸优化圣经- Convex Optimization （附730pdf下载）

【斯坦福】凸优化圣经- Convex Optimization （附730pdf下载）

专知会员服务

229+阅读 · 2020年6月5日

【CMU博士论文】用动态超参数优化改进深度学习训练和推理，Improving Deep Learning Training and Inference with Dynamic Hyperparameter Optimization

【CMU博士论文】用动态超参数优化改进深度学习训练和推理，Improving Deep Learning Training and Inference with Dynamic Hyperparameter Optimization

专知会员服务

55+阅读 · 2020年5月26日

Fariz Darari简明《博弈论Game Theory》介绍，35页ppt

Fariz Darari简明《博弈论Game Theory》介绍，35页ppt

专知会员服务

111+阅读 · 2020年5月15日

Python分布式计算，171页pdf，Distributed Computing with Python

Python分布式计算，171页pdf，Distributed Computing with Python

专知会员服务

108+阅读 · 2020年5月3日

【综述】超参数优化:算法和应用综述，Hyper-Parameter Optimization: A Review of Algorithms and Applications

【综述】超参数优化:算法和应用综述，Hyper-Parameter Optimization: A Review of Algorithms and Applications

专知会员服务

57+阅读 · 2020年3月13日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

MIT新书《强化学习与最优控制》

MIT新书《强化学习与最优控制》

专知会员服务

280+阅读 · 2019年10月9日

强化学习三篇论文避免遗忘等

强化学习三篇论文避免遗忘等

CreateAMind

20+阅读 · 2019年5月24日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

【论文推荐】最新六篇主题模型相关论文—领域特定知识库、神经变分推断、动态和静态主题模型

【论文推荐】最新六篇主题模型相关论文—领域特定知识库、神经变分推断、动态和静态主题模型

专知

19+阅读 · 2018年6月26日

【论文推荐】最新八篇推荐系统相关论文—可解释推荐、上下文感知推荐系统、异构知识库嵌入、深度强化学习、移动推荐系统

【论文推荐】最新八篇推荐系统相关论文—可解释推荐、上下文感知推荐系统、异构知识库嵌入、深度强化学习、移动推荐系统

专知

17+阅读 · 2018年6月16日

语音顶级会议Interspeech2018接受论文列表！

语音顶级会议Interspeech2018接受论文列表！

专知

6+阅读 · 2018年6月10日

【论文推荐】最新七篇强化学习相关论文—逻辑约束、综述、多任务深度强化学习、参数服务器、事件抽取、分层强化学习、过拟合研究

【论文推荐】最新七篇强化学习相关论文—逻辑约束、综述、多任务深度强化学习、参数服务器、事件抽取、分层强化学习、过拟合研究

专知

25+阅读 · 2018年4月29日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

【学习】Hierarchical Softmax

【学习】Hierarchical Softmax

机器学习研究会

4+阅读 · 2017年8月6日

强化学习族谱

强化学习族谱

CreateAMind

26+阅读 · 2017年8月2日

Efficient Model-Based Reinforcement Learning through Optimistic Policy Search and Planning

Arxiv

0+阅读 · 2020年12月1日

The NetHack Learning Environment

Arxiv

0+阅读 · 2020年12月1日

Online learning-based Model Predictive Control with Gaussian Process Models and Stability Guarantees

Arxiv

0+阅读 · 2020年12月1日

Piecewise-Stationary Off-Policy Optimization

Arxiv

0+阅读 · 2020年12月1日

Provably Efficient Exploration for Reinforcement Learning Using Unsupervised Learning

Arxiv

0+阅读 · 2020年12月1日

Variance based sensitivity analysis for Monte Carlo and importance sampling reliability assessment with Gaussian processes

Arxiv

0+阅读 · 2020年11月30日

A Kaczmarz Method with Simple Random Sampling for Solving Large Linear Systems

Arxiv

0+阅读 · 2020年11月30日

A Population-based Hybrid Approach to Hyperparameter Optimization for Neural Networks

Arxiv

0+阅读 · 2020年11月27日

Reinforcement Learning Enhanced Quantum-inspired Algorithm for Combinatorial Optimization

Arxiv

4+阅读 · 2020年2月14日

Mean Field Multi-Agent Reinforcement Learning

Arxiv

5+阅读 · 2018年6月12日

VIP会员

文章信息

相关主题

赌博机/老虎机

相关VIP内容

【NeurIPS2020】图网的主邻域聚合

【NeurIPS2020】图网的主邻域聚合

专知会员服务

33+阅读 · 2020年9月27日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

【斯坦福】凸优化圣经- Convex Optimization （附730pdf下载）

【斯坦福】凸优化圣经- Convex Optimization （附730pdf下载）

专知会员服务

229+阅读 · 2020年6月5日

【CMU博士论文】用动态超参数优化改进深度学习训练和推理，Improving Deep Learning Training and Inference with Dynamic Hyperparameter Optimization

【CMU博士论文】用动态超参数优化改进深度学习训练和推理，Improving Deep Learning Training and Inference with Dynamic Hyperparameter Optimization

专知会员服务

55+阅读 · 2020年5月26日

Fariz Darari简明《博弈论Game Theory》介绍，35页ppt

Fariz Darari简明《博弈论Game Theory》介绍，35页ppt

专知会员服务

111+阅读 · 2020年5月15日

Python分布式计算，171页pdf，Distributed Computing with Python

Python分布式计算，171页pdf，Distributed Computing with Python

专知会员服务

108+阅读 · 2020年5月3日

【综述】超参数优化:算法和应用综述，Hyper-Parameter Optimization: A Review of Algorithms and Applications

【综述】超参数优化:算法和应用综述，Hyper-Parameter Optimization: A Review of Algorithms and Applications

专知会员服务

57+阅读 · 2020年3月13日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

MIT新书《强化学习与最优控制》

MIT新书《强化学习与最优控制》

专知会员服务

280+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

新书册《几何深度学习的数学基础》

中程单向攻击无人机的战略意义：俄乌战争启示

在无标注条件下适配视觉—语言模型：全面综述

面向视觉语言模型的持续学习：遗忘之外的综述与分类体系

相关资讯

强化学习三篇论文避免遗忘等

强化学习三篇论文避免遗忘等

CreateAMind

20+阅读 · 2019年5月24日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

【论文推荐】最新六篇主题模型相关论文—领域特定知识库、神经变分推断、动态和静态主题模型

【论文推荐】最新六篇主题模型相关论文—领域特定知识库、神经变分推断、动态和静态主题模型

专知

19+阅读 · 2018年6月26日

【论文推荐】最新八篇推荐系统相关论文—可解释推荐、上下文感知推荐系统、异构知识库嵌入、深度强化学习、移动推荐系统

【论文推荐】最新八篇推荐系统相关论文—可解释推荐、上下文感知推荐系统、异构知识库嵌入、深度强化学习、移动推荐系统

专知

17+阅读 · 2018年6月16日

语音顶级会议Interspeech2018接受论文列表！

语音顶级会议Interspeech2018接受论文列表！

专知

6+阅读 · 2018年6月10日

【论文推荐】最新七篇强化学习相关论文—逻辑约束、综述、多任务深度强化学习、参数服务器、事件抽取、分层强化学习、过拟合研究

【论文推荐】最新七篇强化学习相关论文—逻辑约束、综述、多任务深度强化学习、参数服务器、事件抽取、分层强化学习、过拟合研究

专知

25+阅读 · 2018年4月29日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

【学习】Hierarchical Softmax

【学习】Hierarchical Softmax

机器学习研究会

4+阅读 · 2017年8月6日

强化学习族谱

强化学习族谱

CreateAMind

26+阅读 · 2017年8月2日

相关论文

Efficient Model-Based Reinforcement Learning through Optimistic Policy Search and Planning

Arxiv

0+阅读 · 2020年12月1日

The NetHack Learning Environment

Arxiv

0+阅读 · 2020年12月1日

Online learning-based Model Predictive Control with Gaussian Process Models and Stability Guarantees

Arxiv

0+阅读 · 2020年12月1日

Piecewise-Stationary Off-Policy Optimization

Arxiv

0+阅读 · 2020年12月1日

Provably Efficient Exploration for Reinforcement Learning Using Unsupervised Learning

Arxiv

0+阅读 · 2020年12月1日

Variance based sensitivity analysis for Monte Carlo and importance sampling reliability assessment with Gaussian processes

Arxiv

0+阅读 · 2020年11月30日

A Kaczmarz Method with Simple Random Sampling for Solving Large Linear Systems

Arxiv

0+阅读 · 2020年11月30日

A Population-based Hybrid Approach to Hyperparameter Optimization for Neural Networks

Arxiv

0+阅读 · 2020年11月27日

Reinforcement Learning Enhanced Quantum-inspired Algorithm for Combinatorial Optimization

Arxiv

4+阅读 · 2020年2月14日

Mean Field Multi-Agent Reinforcement Learning

Arxiv

5+阅读 · 2018年6月12日

微信扫码咨询专知VIP会员