通过产生政策模式的可调适人口剂 (Adaptable Agent Populations via a Generative Model of Policies) - 专知论文

会员服务 ·

0

生成模型 · 回合 · MoDELS · 多样性 · AIM ·

2021 年 7 月 15 日

Adaptable Agent Populations via a Generative Model of Policies

翻译：通过产生政策模式的可调适人口剂

Kenneth Derek,Phillip Isola

from arxiv, Website at https://kennyderek.github.io/adap/

In the natural world, life has found innumerable ways to survive and often thrive. Between and even within species, each individual is in some manner unique, and this diversity lends adaptability and robustness to life. In this work, we aim to learn a space of diverse and high-reward policies on any given environment. To this end, we introduce a generative model of policies, which maps a low-dimensional latent space to an agent policy space. Our method enables learning an entire population of agent policies, without requiring the use of separate policy parameters. Just as real world populations can adapt and evolve via natural selection, our method is able to adapt to changes in our environment solely by selecting for policies in latent space. We test our generative model's capabilities in a variety of environments, including an open-ended grid-world and a two-player soccer environment. Code, visualizations, and additional experiments can be found at https://kennyderek.github.io/adap/.

翻译：在自然界,生命已经找到了无数生存和经常繁荣的方法。在物种之间,甚至在物种内部,每个人都以某种方式具有独特性,这种多样性为生命提供了适应性和强健性。在这项工作中,我们的目标是在任何特定环境中学习多样化和高回报政策的空间。为此,我们引入了一种基因化的政策模型,为代理政策空间绘制了低维潜伏空间图。我们的方法使整个代理政策群得以学习,而无需使用不同的政策参数。正如现实世界人口可以通过自然选择来适应和演变,我们的方法只能通过选择潜藏空间的政策来适应我们环境的变化。我们测试我们的基因模型在各种环境中的能力,包括开放的网格世界和双人足球环境。可以在https://kenyderek.github.io/adap/上找到代码、可视化和额外实验。

1

相关内容

生成模型

在机器学习中，生成模型可以用来直接对数据建模（例如根据某个变量的概率密度函数进行数据采样），也可以用来建立变量间的条件概率分布。条件概率分布可以由生成模型根据贝叶斯定理形成。

【KDD2021】图神经网络，NUS- Xavier Bresson教授

【KDD2021】图神经网络，NUS- Xavier Bresson教授

专知会员服务

66+阅读 · 2021年8月20日

【NeurIPS 2020】图神经网络的参数化解释器，Parameterized Explainer for GNN

【NeurIPS 2020】图神经网络的参数化解释器，Parameterized Explainer for GNN

专知会员服务

22+阅读 · 2020年11月13日

最新【深度生成模型】Deep Generative Models，104页ppt

最新【深度生成模型】Deep Generative Models，104页ppt

专知会员服务

71+阅读 · 2020年10月24日

迁移学习简明教程，11页ppt

迁移学习简明教程，11页ppt

专知会员服务

108+阅读 · 2020年8月4日

因果图，Causal Graphs，52页ppt

因果图，Causal Graphs，52页ppt

专知会员服务

252+阅读 · 2020年4月19日

深度强化学习策略梯度教程，53页ppt

深度强化学习策略梯度教程，53页ppt

专知会员服务

184+阅读 · 2020年2月1日

【强化学习资源集合】Awesome Reinforcement Learning

【强化学习资源集合】Awesome Reinforcement Learning

专知会员服务

97+阅读 · 2019年12月23日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

2019年机器学习框架回顾

2019年机器学习框架回顾

专知会员服务

36+阅读 · 2019年10月11日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

强化学习三篇论文避免遗忘等

强化学习三篇论文避免遗忘等

CreateAMind

20+阅读 · 2019年5月24日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

【学习】Hierarchical Softmax

【学习】Hierarchical Softmax

机器学习研究会

4+阅读 · 2017年8月6日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

强化学习族谱

强化学习族谱

CreateAMind

26+阅读 · 2017年8月2日

Infusing model predictive control into meta-reinforcement learning for mobile robots in dynamic environments

Arxiv

0+阅读 · 2021年9月15日

Towards Understanding the Generative Capability of Adversarially Robust Classifiers

Arxiv

0+阅读 · 2021年9月14日

Learning to Walk via Deep Reinforcement Learning

Arxiv

7+阅读 · 2018年12月26日

Generalizing Across Multi-Objective Reward Functions in Deep Reinforcement Learning

Generalizing Across Multi-Objective Reward Functions in Deep Reinforcement Learning

Arxiv

5+阅读 · 2018年9月17日

GEP-PG: Decoupling Exploration and Exploitation in Deep Reinforcement Learning Algorithms

GEP-PG: Decoupling Exploration and Exploitation in Deep Reinforcement Learning Algorithms

Arxiv

4+阅读 · 2018年8月17日

ALMN: Deep Embedding Learning with Geometrical Virtual Point Generating

Arxiv

5+阅读 · 2018年6月5日

Learning to Adapt: Meta-Learning for Model-Based Control

Arxiv

9+阅读 · 2018年3月30日

Pose-Normalized Image Generation for Person Re-identification

Arxiv

5+阅读 · 2018年2月13日

Parameter Space Noise for Exploration

Arxiv

3+阅读 · 2018年1月31日

A Generative Model For Zero Shot Learning Using Conditional Variational Autoencoders

Arxiv

9+阅读 · 2018年1月27日

VIP会员

文章信息

相关主题

相关VIP内容

【KDD2021】图神经网络，NUS- Xavier Bresson教授

【KDD2021】图神经网络，NUS- Xavier Bresson教授

专知会员服务

66+阅读 · 2021年8月20日

【NeurIPS 2020】图神经网络的参数化解释器，Parameterized Explainer for GNN

【NeurIPS 2020】图神经网络的参数化解释器，Parameterized Explainer for GNN

专知会员服务

22+阅读 · 2020年11月13日

最新【深度生成模型】Deep Generative Models，104页ppt

最新【深度生成模型】Deep Generative Models，104页ppt

专知会员服务

71+阅读 · 2020年10月24日

迁移学习简明教程，11页ppt

迁移学习简明教程，11页ppt

专知会员服务

108+阅读 · 2020年8月4日

因果图，Causal Graphs，52页ppt

因果图，Causal Graphs，52页ppt

专知会员服务

252+阅读 · 2020年4月19日

深度强化学习策略梯度教程，53页ppt

深度强化学习策略梯度教程，53页ppt

专知会员服务

184+阅读 · 2020年2月1日

【强化学习资源集合】Awesome Reinforcement Learning

【强化学习资源集合】Awesome Reinforcement Learning

专知会员服务

97+阅读 · 2019年12月23日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

2019年机器学习框架回顾

2019年机器学习框架回顾

专知会员服务

36+阅读 · 2019年10月11日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

大语言模型幻觉：系统综述

《分析与预测陆军战斗体能测试表现：统计与机器学习方法》2025最新137页

【博士论文】数据与任务的物理学：深度学习中的局部性与组合性理论

代理式人工智能时代的决策优势

相关资讯

强化学习三篇论文避免遗忘等

强化学习三篇论文避免遗忘等

CreateAMind

20+阅读 · 2019年5月24日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

meta learning 17年：MAML SNAIL

meta learning 17年：MAML SNAIL

CreateAMind

11+阅读 · 2019年1月2日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

【学习】Hierarchical Softmax

【学习】Hierarchical Softmax

机器学习研究会

4+阅读 · 2017年8月6日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

强化学习族谱

强化学习族谱

CreateAMind

26+阅读 · 2017年8月2日

相关论文

Infusing model predictive control into meta-reinforcement learning for mobile robots in dynamic environments

Arxiv

0+阅读 · 2021年9月15日

Towards Understanding the Generative Capability of Adversarially Robust Classifiers

Arxiv

0+阅读 · 2021年9月14日

Learning to Walk via Deep Reinforcement Learning

Arxiv

7+阅读 · 2018年12月26日

Generalizing Across Multi-Objective Reward Functions in Deep Reinforcement Learning

Generalizing Across Multi-Objective Reward Functions in Deep Reinforcement Learning

Arxiv

5+阅读 · 2018年9月17日

GEP-PG: Decoupling Exploration and Exploitation in Deep Reinforcement Learning Algorithms

GEP-PG: Decoupling Exploration and Exploitation in Deep Reinforcement Learning Algorithms

Arxiv

4+阅读 · 2018年8月17日

ALMN: Deep Embedding Learning with Geometrical Virtual Point Generating

Arxiv

5+阅读 · 2018年6月5日

Learning to Adapt: Meta-Learning for Model-Based Control

Arxiv

9+阅读 · 2018年3月30日

Pose-Normalized Image Generation for Person Re-identification

Arxiv

5+阅读 · 2018年2月13日

Parameter Space Noise for Exploration

Arxiv

3+阅读 · 2018年1月31日

A Generative Model For Zero Shot Learning Using Conditional Variational Autoencoders

Arxiv

9+阅读 · 2018年1月27日

微信扫码咨询专知VIP会员