Relative Variational Intrinsic Control - 专知 (Zhuanzhi) Papers

December 14, 2020

Relative Variational Intrinsic Control

Kate Baumli, David Warde-Farley, Steven Hansen, Volodymyr Mnih

From arXiv; accepted at AAAI 2021.

In the absence of external rewards, agents can still learn useful behaviors by identifying and mastering a set of diverse skills within their environment. Existing skill learning methods use mutual information objectives to incentivize each skill to be diverse and distinguishable from the rest. However, if care is not taken to constrain the ways in which the skills are diverse, trivially diverse skill sets can arise. To ensure useful skill diversity, we propose a novel skill learning objective, Relative Variational Intrinsic Control (RVIC), which incentivizes learning skills that are distinguishable in how they change the agent's relationship to its environment. The resulting set of skills tiles the space of affordances available to the agent. We qualitatively analyze skill behaviors on multiple environments and show how RVIC skills are more useful than skills discovered by existing methods when used in hierarchical reinforcement learning.
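The mutual information objectives mentioned in the abstract are commonly optimized through a variational lower bound: a learned discriminator q(z | s) tries to recover the skill z from a state reached while following it, and the skill policy is rewarded with log q(z | s) - log p(z). The sketch below illustrates only that generic form, as an assumption about how existing methods are typically set up. It is not the RVIC objective, whose exact definition is not given in this abstract. The `CountDiscriminator` class, the `skill_reward` function, and the state names are hypothetical and chosen for illustration.

```python
import math
from collections import defaultdict


class CountDiscriminator:
    """Hypothetical tabular q(z | s), estimated from counts with add-one smoothing."""

    def __init__(self, num_skills):
        self.num_skills = num_skills
        # Maps a final state to a list of per-skill visit counts.
        self.counts = defaultdict(lambda: [0] * num_skills)

    def update(self, final_state, skill):
        # Record that `skill` ended an episode in `final_state`.
        self.counts[final_state][skill] += 1

    def prob(self, final_state, skill):
        row = self.counts[final_state]
        return (row[skill] + 1) / (sum(row) + self.num_skills)


def skill_reward(disc, final_state, skill):
    """Generic variational MI reward: log q(z | s_final) - log p(z), uniform p(z)."""
    log_prior = -math.log(disc.num_skills)
    return math.log(disc.prob(final_state, skill)) - log_prior


# Toy data (assumed): skill 0 tends to end in "left_room", skill 1 in
# "right_room", and both sometimes end in "start".
disc = CountDiscriminator(num_skills=2)
for _ in range(50):
    disc.update("left_room", 0)
    disc.update("right_room", 1)
    disc.update("start", 0)
    disc.update("start", 1)

r_distinct = skill_reward(disc, "left_room", 0)   # state identifies the skill
r_ambiguous = skill_reward(disc, "start", 0)      # state says nothing about the skill
print(r_distinct, r_ambiguous)
```

In this toy setup, a final state that identifies the skill earns a positive reward, while a state reached equally often by both skills earns roughly zero. A reward of this kind only asks that skills be distinguishable, which is the gap the abstract points to when it warns about trivially diverse skill sets.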

