环境强化学习自编背景评价 (Self-Paced Context Evaluation for Contextual Reinforcement Learning) - 专知论文

会员服务 ·

0

学成 · 泛化理论 · 强化学习 · INFORMS · Performer ·

2021 年 6 月 9 日

Self-Paced Context Evaluation for Contextual Reinforcement Learning

翻译：环境强化学习自编背景评价

Theresa Eimer,André Biedenkapp,Frank Hutter,Marius Lindauer

Reinforcement learning (RL) has made a lot of advances for solving a single problem in a given environment; but learning policies that generalize to unseen variations of a problem remains challenging. To improve sample efficiency for learning on such instances of a problem domain, we present Self-Paced Context Evaluation (SPaCE). Based on self-paced learning, \spc automatically generates \task curricula online with little computational overhead. To this end, SPaCE leverages information contained in state values during training to accelerate and improve training performance as well as generalization capabilities to new instances from the same problem domain. Nevertheless, SPaCE is independent of the problem domain at hand and can be applied on top of any RL agent with state-value function approximation. We demonstrate SPaCE's ability to speed up learning of different value-based RL agents on two environments, showing better generalization capabilities and up to 10x faster learning compared to naive approaches such as round robin or SPDRL, as the closest state-of-the-art approach.

翻译：强化学习(RL)在解决特定环境中的单一问题方面取得了许多进步;但是,学习政策一般化为一个问题的无形变异仍然具有挑战性。为了提高在这类问题领域学习的样本效率,我们提出“自略背景评估 ” (SPaCE ) 。基于自定进度的学习,\spc 自动在网上生成 task 课程,而很少计算间接费用。为此,SPaCE 利用培训期间国家价值观中的信息,加快和改进培训业绩,提高普及能力,从同一问题领域到新情况。然而,SPaCE 独立于问题领域,并且可以在任何具有国家价值功能接近的RL 代理的顶部应用。我们展示了SPaCE 在两种环境中加速学习不同价值的RL 代理的能力,展示了更好的普及能力和多达10x的更快的学习速度,而最接近于天真的方法,例如圆轮抢劫或SPDRL,作为最先进的状态方法。

0

相关内容

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

可解释强化学习，Explainable Reinforcement Learning: A Survey

可解释强化学习，Explainable Reinforcement Learning: A Survey

专知会员服务

131+阅读 · 2020年5月14日

强化学习的对比无监督表示，CURL: Contrastive Unsupervised Representations for Reinforcement Learning

强化学习的对比无监督表示，CURL: Contrastive Unsupervised Representations for Reinforcement Learning

专知会员服务

41+阅读 · 2020年4月11日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

95+阅读 · 2020年3月12日

元迁移学习的小样本学习，Meta-transfer Learning for Few-shot Learning

元迁移学习的小样本学习，Meta-transfer Learning for Few-shot Learning

专知会员服务

159+阅读 · 2020年2月29日

深度强化学习策略梯度教程，53页ppt

深度强化学习策略梯度教程，53页ppt

专知会员服务

184+阅读 · 2020年2月1日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

181+阅读 · 2019年10月11日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

Hierarchical Imitation - Reinforcement Learning

Hierarchical Imitation - Reinforcement Learning

CreateAMind

19+阅读 · 2018年5月25日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

【学习】Hierarchical Softmax

【学习】Hierarchical Softmax

机器学习研究会

4+阅读 · 2017年8月6日

强化学习族谱

强化学习族谱

CreateAMind

26+阅读 · 2017年8月2日

Attend2Pack: Bin Packing through Deep Reinforcement Learning with Attention

Arxiv

0+阅读 · 2021年8月2日

Generative Adversarial Active Learning for Unsupervised Outlier Detection

Generative Adversarial Active Learning for Unsupervised Outlier Detection

Arxiv

5+阅读 · 2019年3月14日

Risk-Aware Active Inverse Reinforcement Learning

Risk-Aware Active Inverse Reinforcement Learning

Arxiv

8+阅读 · 2019年1月8日

Multi-task Deep Reinforcement Learning with PopArt

Multi-task Deep Reinforcement Learning with PopArt

Arxiv

4+阅读 · 2018年9月12日

Reinforcement Learning for Solving the Vehicle Routing Problem

Arxiv

3+阅读 · 2018年5月21日

Diff-DAC: Distributed Actor-Critic for Average Multitask Deep Reinforcement Learning

Arxiv

4+阅读 · 2018年4月22日

Learning to Adapt: Meta-Learning for Model-Based Control

Arxiv

9+阅读 · 2018年3月30日

Cache-Enabled Dynamic Rate Allocation via Deep Self-Transfer Reinforcement Learning

Arxiv

4+阅读 · 2018年3月30日

Stack-Captioning: Coarse-to-Fine Learning for Image Captioning

Arxiv

6+阅读 · 2018年3月14日

Self Paced Deep Learning for Weakly Supervised Object Detection

Arxiv

8+阅读 · 2018年2月21日

VIP会员

文章信息

相关主题

相关VIP内容

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

可解释强化学习，Explainable Reinforcement Learning: A Survey

可解释强化学习，Explainable Reinforcement Learning: A Survey

专知会员服务

131+阅读 · 2020年5月14日

强化学习的对比无监督表示，CURL: Contrastive Unsupervised Representations for Reinforcement Learning

强化学习的对比无监督表示，CURL: Contrastive Unsupervised Representations for Reinforcement Learning

专知会员服务

41+阅读 · 2020年4月11日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

95+阅读 · 2020年3月12日

元迁移学习的小样本学习，Meta-transfer Learning for Few-shot Learning

元迁移学习的小样本学习，Meta-transfer Learning for Few-shot Learning

专知会员服务

159+阅读 · 2020年2月29日

深度强化学习策略梯度教程，53页ppt

深度强化学习策略梯度教程，53页ppt

专知会员服务

184+阅读 · 2020年2月1日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

181+阅读 · 2019年10月11日

热门VIP内容

开通专知VIP会员享更多权益服务

新质生成式AI赋能产业变革的实践与路径

用于多模态大模型的离散标记化：全面综述

Nature综述：金融网络中的物理学

【CMU博士论文】通信高效且差分隐私的优化方法

相关资讯

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

Hierarchical Imitation - Reinforcement Learning

Hierarchical Imitation - Reinforcement Learning

CreateAMind

19+阅读 · 2018年5月25日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

【学习】Hierarchical Softmax

【学习】Hierarchical Softmax

机器学习研究会

4+阅读 · 2017年8月6日

强化学习族谱

强化学习族谱

CreateAMind

26+阅读 · 2017年8月2日

相关论文

Attend2Pack: Bin Packing through Deep Reinforcement Learning with Attention

Arxiv

0+阅读 · 2021年8月2日

Generative Adversarial Active Learning for Unsupervised Outlier Detection

Generative Adversarial Active Learning for Unsupervised Outlier Detection

Arxiv

5+阅读 · 2019年3月14日

Risk-Aware Active Inverse Reinforcement Learning

Risk-Aware Active Inverse Reinforcement Learning

Arxiv

8+阅读 · 2019年1月8日

Multi-task Deep Reinforcement Learning with PopArt

Multi-task Deep Reinforcement Learning with PopArt

Arxiv

4+阅读 · 2018年9月12日

Reinforcement Learning for Solving the Vehicle Routing Problem

Arxiv

3+阅读 · 2018年5月21日

Diff-DAC: Distributed Actor-Critic for Average Multitask Deep Reinforcement Learning

Arxiv

4+阅读 · 2018年4月22日

Learning to Adapt: Meta-Learning for Model-Based Control

Arxiv

9+阅读 · 2018年3月30日

Cache-Enabled Dynamic Rate Allocation via Deep Self-Transfer Reinforcement Learning

Arxiv

4+阅读 · 2018年3月30日

Stack-Captioning: Coarse-to-Fine Learning for Image Captioning

Arxiv

6+阅读 · 2018年3月14日

Self Paced Deep Learning for Weakly Supervised Object Detection

Arxiv

8+阅读 · 2018年2月21日

微信扫码咨询专知VIP会员