Automatic Curricula via Expert Demonstrations - Zhuanzhi Papers



June 16, 2021

Automatic Curricula via Expert Demonstrations


Siyu Dai, Andreas Hofmann, Brian Williams

From arXiv. Preprint, work in progress.

We propose Automatic Curricula via Expert Demonstrations (ACED), a reinforcement learning (RL) approach that combines the ideas of imitation learning and curriculum learning in order to solve challenging robotic manipulation tasks with sparse reward functions. Curriculum learning solves complicated RL tasks by introducing a sequence of auxiliary tasks with increasing difficulty, yet how to automatically design effective and generalizable curricula remains a challenging research problem. ACED extracts curricula from a small number of expert demonstration trajectories by dividing demonstrations into sections and initializing training episodes to states sampled from different sections of demonstrations. Through moving the reset states from the end to the beginning of demonstrations as the learning agent improves its performance, ACED not only learns challenging manipulation tasks with unseen initializations and goals, but also discovers novel solutions that are distinct from the demonstrations. In addition, ACED can be naturally combined with other imitation learning methods to utilize expert demonstrations in a more efficient manner, and we show that a combination of ACED with behavior cloning allows pick-and-place tasks to be learned with as few as 1 demonstration and block stacking tasks to be learned with 20 demonstrations.
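
The reset-state curriculum described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the class name `DemoCurriculum`, the equal-length sections, the success-rate threshold, and the sliding window are all assumptions made for illustration, since the abstract does not state how sections are chosen or when the curriculum advances. Demonstrations are treated as plain lists of states that a simulator could be reset to.

```python
import random
from collections import deque


class DemoCurriculum:
    """Illustrative reverse curriculum over demonstration sections.

    Assumed details (not given in the abstract): sections have equal length,
    and the curriculum advances when the recent success rate reaches a
    fixed threshold.
    """

    def __init__(self, demos, num_sections=4, success_threshold=0.8,
                 window=20, seed=0):
        self.demos = demos                      # list of trajectories (lists of states)
        self.num_sections = num_sections
        self.success_threshold = success_threshold
        self.window = window
        self.section = num_sections - 1         # start in the section nearest the goal
        self.results = deque(maxlen=window)     # recent episode outcomes
        self.rng = random.Random(seed)

    def _bounds(self, demo_len):
        # Index range [start, end) of the current section within one demonstration.
        start = self.section * demo_len // self.num_sections
        end = (self.section + 1) * demo_len // self.num_sections
        return start, max(end, start + 1)       # keep at least one state per section

    def sample_reset_state(self):
        # Pick a demonstration, then a state from the current section of it.
        demo = self.rng.choice(self.demos)
        start, end = self._bounds(len(demo))
        return demo[self.rng.randrange(start, end)]

    def report(self, success):
        # Record an episode outcome; move one section toward the start of the
        # demonstrations once the window is full and the success rate is high enough.
        self.results.append(bool(success))
        window_full = len(self.results) == self.window
        if (window_full and self.section > 0
                and sum(self.results) / self.window >= self.success_threshold):
            self.section -= 1
            self.results.clear()
```

In a training loop, each episode would reset the environment to `sample_reset_state()`, run the RL policy, and pass the sparse-reward outcome to `report`. Once `section` reaches 0, episodes start from the earliest part of the demonstrations; how the method then handles unseen initializations and goals is beyond what this sketch covers.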

