Goal-conditioned Reinforcement Learning (RL) aims at learning optimal policies, given goals encoded in special command inputs. Here we study goal-conditioned neural nets (NNs) that learn to generate deep NN policies in the form of context-specific weight matrices, similar to Fast Weight Programmers and other methods from the 1990s. Using context commands of the form "generate a policy that achieves a desired expected return," our NN generators combine powerful exploration of parameter space with generalization across commands to iteratively find better and better policies. A form of weight-sharing HyperNetworks and policy embeddings scales our method to generate deep NNs. Experiments show how a single learned policy generator can produce policies that achieve any return seen during training. Finally, we evaluate our algorithm on a set of continuous control tasks, where it exhibits competitive performance. Our code is public.
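To make the command-to-weights idea concrete, the sketch below shows a minimal generator that maps a scalar return command to the parameters of a small policy MLP and immediately runs the generated policy on an observation. This is only an illustration of the general mechanism described above: the layer sizes, the command encoding, and the class name `PolicyGenerator` are assumptions, and the weight-sharing HyperNetwork and policy-embedding components of the full method are omitted.

```python
import torch
import torch.nn as nn


class PolicyGenerator(nn.Module):
    """Hypothetical sketch: maps a desired-return command to the weights
    of a two-layer policy network (not the paper's exact architecture)."""

    def __init__(self, obs_dim, act_dim, hidden=32, cmd_embed=64):
        super().__init__()
        self.obs_dim, self.act_dim, self.hidden = obs_dim, act_dim, hidden
        # Total number of parameters in the generated two-layer policy.
        self.n_params = (obs_dim + 1) * hidden + (hidden + 1) * act_dim
        # Generator: scalar return command -> flat policy parameter vector.
        self.net = nn.Sequential(
            nn.Linear(1, cmd_embed), nn.ReLU(),
            nn.Linear(cmd_embed, self.n_params),
        )

    def forward(self, desired_return, obs):
        # Generate a flat parameter vector conditioned on the command.
        theta = self.net(desired_return.view(1, 1)).squeeze(0)
        i = 0
        w1 = theta[i:i + self.obs_dim * self.hidden].view(self.hidden, self.obs_dim)
        i += self.obs_dim * self.hidden
        b1 = theta[i:i + self.hidden]
        i += self.hidden
        w2 = theta[i:i + self.hidden * self.act_dim].view(self.act_dim, self.hidden)
        i += self.hidden * self.act_dim
        b2 = theta[i:i + self.act_dim]
        # Run the generated policy on the observation.
        h = torch.tanh(obs @ w1.T + b1)
        return torch.tanh(h @ w2.T + b2)


# Usage: ask the generator for a policy intended to achieve a return of 200.
gen = PolicyGenerator(obs_dim=4, act_dim=1)
action = gen(torch.tensor([200.0]), torch.randn(4))
```

Training such a generator would then alternate between sampling return commands, evaluating the generated policies in the environment, and updating the generator so that commanded and achieved returns match, which is how generalization across commands can be exploited to iteratively reach higher returns.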