Automatic Goal Generation using Dynamical Distance Learning - 专知 (Zhuanzhi) Papers



November 7, 2021

Automatic Goal Generation using Dynamical Distance Learning


Bharat Prakash,Nicholas Waytowich,Tinoosh Mohsenin,Tim Oates

Reinforcement learning (RL) agents can learn to solve complex sequential decision-making tasks by interacting with the environment. However, sample efficiency remains a major challenge. In multi-goal RL, where agents must reach multiple goals to solve complex tasks, improving sample efficiency can be especially challenging. Humans and other biological agents, by contrast, learn such tasks far more strategically, following a curriculum in which tasks are sampled with increasing difficulty so that learning progresses gradually and efficiently. In this work, we propose a method for automatic goal generation using a dynamical distance function (DDF) in a self-supervised fashion. A DDF is a function that predicts the dynamical distance between any two states within a Markov decision process (MDP). With it, we generate a curriculum of goals at the appropriate difficulty level to facilitate efficient learning throughout the training process. We evaluate this approach on several goal-conditioned robotic manipulation and navigation tasks and show improvements in sample efficiency over a baseline method that uses only random goal sampling.
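To make the idea in the abstract concrete, the following is a minimal sketch of distance-based goal selection, not the authors' implementation. Everything in it is an assumption made for illustration: the toy 1-D chain environment, the function names (`collect_trajectory`, `fit_tabular_ddf`, `sample_goal`), the tabular estimator that keeps the smallest observed step count, and the distance band `[d_min, d_max]` used to define "appropriate difficulty". The abstract does not specify the estimator or the selection rule; a practical system would presumably use a learned function approximator in place of the table.

```python
import random


def collect_trajectory(n_states, length, rng):
    """Random walk on a 1-D chain of states 0..n_states-1 (toy stand-in for an MDP)."""
    state = rng.randrange(n_states)
    trajectory = [state]
    for _ in range(length):
        # Move one step left or right, clamped to the ends of the chain.
        state = min(n_states - 1, max(0, state + rng.choice((-1, 1))))
        trajectory.append(state)
    return trajectory


def fit_tabular_ddf(trajectories):
    """Build a tabular distance estimate from self-supervised labels.

    For states visited at steps i <= j of the same trajectory, the label is
    the step count j - i. ASSUMPTION: we keep the smallest label seen for each
    (state, state) pair; this is an illustrative choice, not the paper's.
    """
    ddf = {}
    for trajectory in trajectories:
        for i, s_i in enumerate(trajectory):
            for j in range(i, len(trajectory)):
                key = (s_i, trajectory[j])
                steps = j - i
                if steps < ddf.get(key, float("inf")):
                    ddf[key] = steps
    return ddf


def sample_goal(ddf, start, candidates, d_min, d_max, rng):
    """Pick a goal whose estimated distance from `start` lies in [d_min, d_max].

    Falls back to uniform random sampling when no candidate is in the band.
    """
    in_band = [
        g for g in candidates
        if d_min <= ddf.get((start, g), float("inf")) <= d_max
    ]
    return rng.choice(in_band) if in_band else rng.choice(candidates)


if __name__ == "__main__":
    rng = random.Random(0)
    trajectories = [collect_trajectory(10, 30, rng) for _ in range(200)]
    ddf = fit_tabular_ddf(trajectories)
    # A hypothetical three-stage curriculum: the distance band moves outward.
    for stage in range(3):
        d_min, d_max = 1 + 2 * stage, 3 + 2 * stage
        goal = sample_goal(ddf, 0, list(range(10)), d_min, d_max, rng)
        print(f"stage {stage}: band [{d_min}, {d_max}] -> goal state {goal}")
```

Shifting the band outward as training proceeds is one simple way to obtain goals of increasing difficulty; the random fallback corresponds to the random-goal-sampling baseline mentioned in the abstract.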


