启动的元学习 (Bootstrapped Meta-Learning) - 专知论文

会员服务 ·

0

自助法/自举法 · 可约的 · 元优化 · 元学习器 · MAML ·

2021 年 9 月 9 日

Bootstrapped Meta-Learning

翻译：启动的元学习

Sebastian Flennerhag,Yannick Schroecker,Tom Zahavy,Hado van Hasselt,David Silver,Satinder Singh

from arxiv, 31 pages, 19 figures, 7 tables

Meta-learning empowers artificial intelligence to increase its efficiency by learning how to learn. Unlocking this potential involves overcoming a challenging meta-optimisation problem that often exhibits ill-conditioning, and myopic meta-objectives. We propose an algorithm that tackles these issues by letting the meta-learner teach itself. The algorithm first bootstraps a target from the meta-learner, then optimises the meta-learner by minimising the distance to that target under a chosen (pseudo-)metric. Focusing on meta-learning with gradients, we establish conditions that guarantee performance improvements and show that the improvement is related to the target distance. Thus, by controlling curvature, the distance measure can be used to ease meta-optimization, for instance by reducing ill-conditioning. Further, the bootstrapping mechanism can extend the effective meta-learning horizon without requiring backpropagation through all updates. The algorithm is versatile and easy to implement. We achieve a new state-of-the art for model-free agents on the Atari ALE benchmark, improve upon MAML in few-shot learning, and demonstrate how our approach opens up new possibilities by meta-learning efficient exploration in a Q-learning agent.

翻译：元学习使人工智能能够通过学习学习来提高效率。解锁这一潜力需要克服一个挑战性的元优化问题, 常常表现出不适应和短视的元目标。我们建议一种算法,通过让元脱皮器自学来解决这些问题。算法第一靴套将一个来自元脱皮器的目标设为陷阱, 然后将元脱皮器的距离通过选择的( 假冒) 度量来最小化, 以提高其效率。以梯度为主的元学习为焦点, 我们建立保证业绩改进的条件, 并显示改进与目标距离相关。因此, 通过控制曲线, 远程测量可以用来缓解元脱皮, 例如通过减少不适应性调整。此外, 制靴机制可以扩展有效的元学习视野, 而无需通过所有更新进行反向调整。算法既灵活又容易实施。我们为Atari ALE 基准的无型代理实现了一种新的状态, 改进了在微调的学习中, 在微调的代理中改进了MAL, 展示了我们如何打开新的可能性。

0

相关内容

自助法/自举法

自助法/自举法

深度学习优化算法，73页ppt，Optimization Algorithms on Deep Learning

深度学习优化算法，73页ppt，Optimization Algorithms on Deep Learning

专知会员服务

135+阅读 · 2021年6月16日

ICLR 2021杰出论文奖出炉，8篇论文上榜！

专知会员服务

26+阅读 · 2021年4月2日

迁移学习简明教程，11页ppt

迁移学习简明教程，11页ppt

专知会员服务

108+阅读 · 2020年8月4日

因果图，Causal Graphs，52页ppt

因果图，Causal Graphs，52页ppt

专知会员服务

250+阅读 · 2020年4月19日

深度强化学习策略梯度教程，53页ppt

深度强化学习策略梯度教程，53页ppt

专知会员服务

184+阅读 · 2020年2月1日

【新书稿：强化学习：理论与算法】《Reinforcement Learning: Theory and Algorithms》by Alekh Agarwal, Nan Jiang, Sham M. Kakade (2019)，(附83页pdf)

【新书稿：强化学习：理论与算法】《Reinforcement Learning: Theory and Algorithms》by Alekh Agarwal, Nan Jiang, Sham M. Kakade (2019)，(附83页pdf)

专知会员服务

79+阅读 · 2019年11月23日

开源书：PyTorch深度学习起步

开源书：PyTorch深度学习起步

专知会员服务

51+阅读 · 2019年10月11日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

动物脑的好奇心和强化学习的好奇心

动物脑的好奇心和强化学习的好奇心

CreateAMind

10+阅读 · 2019年1月26日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

RL 真经

CreateAMind

5+阅读 · 2018年12月28日

Hierarchical Imitation - Reinforcement Learning

Hierarchical Imitation - Reinforcement Learning

CreateAMind

19+阅读 · 2018年5月25日

强化学习族谱

强化学习族谱

CreateAMind

26+阅读 · 2017年8月2日

Local Advantage Actor-Critic for Robust Multi-Agent Deep Reinforcement Learning

Arxiv

1+阅读 · 2021年11月2日

An Efficient Transfer Learning Framework for Multiagent Reinforcement Learning

Arxiv

0+阅读 · 2021年10月27日

Bridging Multi-Task Learning and Meta-Learning: Towards Efficient Training and Effective Adaptation

Arxiv

9+阅读 · 2021年6月16日

Efficient Fully-Offline Meta-Reinforcement Learning via Distance Metric Learning and Behavior Regularization

Arxiv

8+阅读 · 2020年11月26日

Meta-Learning with Implicit Gradients

Meta-Learning with Implicit Gradients

Arxiv

13+阅读 · 2019年9月10日

Few-shot Learning: A Survey

Few-shot Learning: A Survey

Arxiv

363+阅读 · 2019年4月10日

Generalization and Regularization in DQN

Generalization and Regularization in DQN

Arxiv

6+阅读 · 2019年1月30日

Meta-Transfer Learning for Few-Shot Learning

Meta-Transfer Learning for Few-Shot Learning

Arxiv

8+阅读 · 2018年12月6日

Large Margin Few-Shot Learning

Arxiv

11+阅读 · 2018年7月8日

Multiagent Cooperation and Competition with Deep Reinforcement Learning

Arxiv

4+阅读 · 2015年11月27日

VIP会员

文章信息

相关主题

自助法/自举法

相关VIP内容

深度学习优化算法，73页ppt，Optimization Algorithms on Deep Learning

深度学习优化算法，73页ppt，Optimization Algorithms on Deep Learning

专知会员服务

135+阅读 · 2021年6月16日

ICLR 2021杰出论文奖出炉，8篇论文上榜！

专知会员服务

26+阅读 · 2021年4月2日

迁移学习简明教程，11页ppt

迁移学习简明教程，11页ppt

专知会员服务

108+阅读 · 2020年8月4日

因果图，Causal Graphs，52页ppt

因果图，Causal Graphs，52页ppt

专知会员服务

250+阅读 · 2020年4月19日

深度强化学习策略梯度教程，53页ppt

深度强化学习策略梯度教程，53页ppt

专知会员服务

184+阅读 · 2020年2月1日

【新书稿：强化学习：理论与算法】《Reinforcement Learning: Theory and Algorithms》by Alekh Agarwal, Nan Jiang, Sham M. Kakade (2019)，(附83页pdf)

【新书稿：强化学习：理论与算法】《Reinforcement Learning: Theory and Algorithms》by Alekh Agarwal, Nan Jiang, Sham M. Kakade (2019)，(附83页pdf)

专知会员服务

79+阅读 · 2019年11月23日

开源书：PyTorch深度学习起步

开源书：PyTorch深度学习起步

专知会员服务

51+阅读 · 2019年10月11日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

【新书】面向企业的图学习扩展：生产级图学习与推理，485页pdf

AI智能体编程：技术、挑战与机遇综述

【国家标准】数据安全技术数据安全风险评估方法

【CMU博士论文】交互式学习的进展：替代性反馈机制与自适应因果推理

相关资讯

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

动物脑的好奇心和强化学习的好奇心

动物脑的好奇心和强化学习的好奇心

CreateAMind

10+阅读 · 2019年1月26日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

RL 真经

CreateAMind

5+阅读 · 2018年12月28日

Hierarchical Imitation - Reinforcement Learning

Hierarchical Imitation - Reinforcement Learning

CreateAMind

19+阅读 · 2018年5月25日

强化学习族谱

强化学习族谱

CreateAMind

26+阅读 · 2017年8月2日

相关论文

Local Advantage Actor-Critic for Robust Multi-Agent Deep Reinforcement Learning

Arxiv

1+阅读 · 2021年11月2日

An Efficient Transfer Learning Framework for Multiagent Reinforcement Learning

Arxiv

0+阅读 · 2021年10月27日

Bridging Multi-Task Learning and Meta-Learning: Towards Efficient Training and Effective Adaptation

Arxiv

9+阅读 · 2021年6月16日

Efficient Fully-Offline Meta-Reinforcement Learning via Distance Metric Learning and Behavior Regularization

Arxiv

8+阅读 · 2020年11月26日

Meta-Learning with Implicit Gradients

Meta-Learning with Implicit Gradients

Arxiv

13+阅读 · 2019年9月10日

Few-shot Learning: A Survey

Few-shot Learning: A Survey

Arxiv

363+阅读 · 2019年4月10日

Generalization and Regularization in DQN

Generalization and Regularization in DQN

Arxiv

6+阅读 · 2019年1月30日

Meta-Transfer Learning for Few-Shot Learning

Meta-Transfer Learning for Few-Shot Learning

Arxiv

8+阅读 · 2018年12月6日

Large Margin Few-Shot Learning

Arxiv

11+阅读 · 2018年7月8日

Multiagent Cooperation and Competition with Deep Reinforcement Learning

Arxiv

4+阅读 · 2015年11月27日

微信扫码咨询专知VIP会员