AMIGO学习:反向激励内在目标 (Learning with AMIGo: Adversarially Motivated Intrinsic Goals) - 专知论文

会员服务 ·

0

学成 · 回合 · state-of-the-art · contrastive · SimPLe ·

2021 年 2 月 23 日

Learning with AMIGo: Adversarially Motivated Intrinsic Goals

翻译：AMIGO学习:反向激励内在目标

Andres Campero,Roberta Raileanu,Heinrich Küttler,Joshua B. Tenenbaum,Tim Rocktäschel,Edward Grefenstette

from arxiv, 18 pages, 6 figures, published at The Ninth International Conference on Learning Representations (2021)

A key challenge for reinforcement learning (RL) consists of learning in environments with sparse extrinsic rewards. In contrast to current RL methods, humans are able to learn new skills with little or no reward by using various forms of intrinsic motivation. We propose AMIGo, a novel agent incorporating -- as form of meta-learning -- a goal-generating teacher that proposes Adversarially Motivated Intrinsic Goals to train a goal-conditioned "student" policy in the absence of (or alongside) environment reward. Specifically, through a simple but effective "constructively adversarial" objective, the teacher learns to propose increasingly challenging -- yet achievable -- goals that allow the student to learn general skills for acting in a new environment, independent of the task to be solved. We show that our method generates a natural curriculum of self-proposed goals which ultimately allows the agent to solve challenging procedurally-generated tasks where other forms of intrinsic motivation and state-of-the-art RL methods fail.

翻译：强化学习(RL)的关键挑战在于在缺乏外部奖励的环境中学习。与目前的RL方法相反,人类能够通过使用各种形式的内在动机来学习新技能,但很少或根本没有奖励。我们提议AMIGO,这是一个新颖的代理机构,作为元学习的形式,纳入一个目标产生型教师,提出反动的内在目标,以便在没有(或同时)环境奖励的情况下,培训一个有目标条件的“学生”政策。具体地说,通过一个简单而有效的“积极对抗性”目标,教师学会提出越来越具有挑战性 -- -- 但却可以实现 -- -- 的目标,使学生能够学习在新环境中行动的一般技能,而独立于有待解决的任务。我们表明,我们的方法产生了一个自然的自发目标课程,最终使代理机构能够在其他内在动机和最先进的RL方法失败的情况下解决具有挑战性的工作。

1

相关内容

Python分布式计算，171页pdf，Distributed Computing with Python

Python分布式计算，171页pdf，Distributed Computing with Python

专知会员服务

108+阅读 · 2020年5月3日

【微软雷德蒙研究院】对抗机器学习工业视角，Adversarial Machine Learning - Industry Perspectives

【微软雷德蒙研究院】对抗机器学习工业视角，Adversarial Machine Learning - Industry Perspectives

专知会员服务

12+阅读 · 2020年2月23日

【论文】量子对抗机器学习，Quantum Adversarial Machine Learning

【论文】量子对抗机器学习，Quantum Adversarial Machine Learning

专知会员服务

38+阅读 · 2020年1月5日

【论文】欺骗学习（Learning by Cheating）

【论文】欺骗学习（Learning by Cheating）

专知会员服务

28+阅读 · 2020年1月3日

【斯坦福大学ICLR2020】无任务的持续元学习，Continue Meta-learning without tasks

【斯坦福大学ICLR2020】无任务的持续元学习，Continue Meta-learning without tasks

专知会员服务

16+阅读 · 2019年12月18日

【Google新论文】Learning Transferable Graph Exploration 附论文下载

【Google新论文】Learning Transferable Graph Exploration 附论文下载

专知会员服务

8+阅读 · 2019年11月4日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

Hierarchical Imitation - Reinforcement Learning

Hierarchical Imitation - Reinforcement Learning

CreateAMind

19+阅读 · 2018年5月25日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

Adversarial Variational Bayes: Unifying VAE and GAN 代码

Adversarial Variational Bayes: Unifying VAE and GAN 代码

CreateAMind

7+阅读 · 2017年10月4日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

Adversarial Soft Advantage Fitting: Imitation Learning without Policy Optimization

Arxiv

0+阅读 · 2021年4月16日

Energy-Based Imitation Learning

Arxiv

0+阅读 · 2021年4月15日

Reward function shape exploration in adversarial imitation learning: an empirical study

Arxiv

0+阅读 · 2021年4月14日

A Distributionally Robust Optimization Method for Adversarial Multiple Kernel Learning

Arxiv

0+阅读 · 2021年4月13日

Contrastive Learning with Hard Negative Samples

Arxiv

7+阅读 · 2020年10月9日

Model-based Adversarial Meta-Reinforcement Learning

Arxiv

5+阅读 · 2020年6月16日

Few-shot Learning with Meta Metric Learners

Arxiv

13+阅读 · 2019年1月26日

Visual Reinforcement Learning with Imagined Goals

Arxiv

8+阅读 · 2018年7月12日

Adversarial Meta-Learning

Arxiv

7+阅读 · 2018年6月8日

Safety-aware Adaptive Reinforcement Learning with Applications to Brushbot Navigation

Arxiv

4+阅读 · 2018年1月29日

VIP会员

文章信息

相关主题

state-of-the-art

相关VIP内容

Python分布式计算，171页pdf，Distributed Computing with Python

Python分布式计算，171页pdf，Distributed Computing with Python

专知会员服务

108+阅读 · 2020年5月3日

【微软雷德蒙研究院】对抗机器学习工业视角，Adversarial Machine Learning - Industry Perspectives

【微软雷德蒙研究院】对抗机器学习工业视角，Adversarial Machine Learning - Industry Perspectives

专知会员服务

12+阅读 · 2020年2月23日

【论文】量子对抗机器学习，Quantum Adversarial Machine Learning

【论文】量子对抗机器学习，Quantum Adversarial Machine Learning

专知会员服务

38+阅读 · 2020年1月5日

【论文】欺骗学习（Learning by Cheating）

【论文】欺骗学习（Learning by Cheating）

专知会员服务

28+阅读 · 2020年1月3日

【斯坦福大学ICLR2020】无任务的持续元学习，Continue Meta-learning without tasks

【斯坦福大学ICLR2020】无任务的持续元学习，Continue Meta-learning without tasks

专知会员服务

16+阅读 · 2019年12月18日

【Google新论文】Learning Transferable Graph Exploration 附论文下载

【Google新论文】Learning Transferable Graph Exploration 附论文下载

专知会员服务

8+阅读 · 2019年11月4日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

检索增强生成（RAG）技术，261页slides

美联参会指南-联合规划与执行概述及政策框架 | 32页

从DeepSeek-R1学到的三个核心经验

大规模视觉模型中的提示式适配：综述

相关资讯

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

Hierarchical Imitation - Reinforcement Learning

Hierarchical Imitation - Reinforcement Learning

CreateAMind

19+阅读 · 2018年5月25日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

Adversarial Variational Bayes: Unifying VAE and GAN 代码

Adversarial Variational Bayes: Unifying VAE and GAN 代码

CreateAMind

7+阅读 · 2017年10月4日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

相关论文

Adversarial Soft Advantage Fitting: Imitation Learning without Policy Optimization

Arxiv

0+阅读 · 2021年4月16日

Energy-Based Imitation Learning

Arxiv

0+阅读 · 2021年4月15日

Reward function shape exploration in adversarial imitation learning: an empirical study

Arxiv

0+阅读 · 2021年4月14日

A Distributionally Robust Optimization Method for Adversarial Multiple Kernel Learning

Arxiv

0+阅读 · 2021年4月13日

Contrastive Learning with Hard Negative Samples

Arxiv

7+阅读 · 2020年10月9日

Model-based Adversarial Meta-Reinforcement Learning

Arxiv

5+阅读 · 2020年6月16日

Few-shot Learning with Meta Metric Learners

Arxiv

13+阅读 · 2019年1月26日

Visual Reinforcement Learning with Imagined Goals

Arxiv

8+阅读 · 2018年7月12日

Adversarial Meta-Learning

Arxiv

7+阅读 · 2018年6月8日

Safety-aware Adaptive Reinforcement Learning with Applications to Brushbot Navigation

Arxiv

4+阅读 · 2018年1月29日

微信扫码咨询专知VIP会员