Animals exhibit an innate ability to learn regularities of the world through interaction. By performing experiments in their environment, they are able to discern the causal factors of variation and infer how they affect the world's dynamics. Inspired by this, we attempt to equip reinforcement learning agents with the ability to perform experiments that facilitate a categorization of the rolled-out trajectories, and to subsequently infer the causal factors of the environment in a hierarchical manner. We introduce {\em causal curiosity}, a novel intrinsic reward, and show that it allows our agents to learn optimal sequences of actions and to discover causal factors in the dynamics of the environment. The learned behavior allows the agents to infer a binary quantized representation of the ground-truth causal factors in every environment. Additionally, we find that these experimental behaviors are semantically meaningful (e.g., our agents learn to lift blocks to categorize them by weight), and are learned in a self-supervised manner with approximately 2.5 times less data than conventional supervised planners. We show that these behaviors can be re-purposed and fine-tuned (e.g., from lifting to pushing or other downstream tasks). Finally, we show that knowledge of the causal factor representations aids zero-shot learning for more complex tasks. Visit https://sites.google.com/usc.edu/causal-curiosity/home for the project website.
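The abstract does not spell out how the intrinsic reward is computed, but its core idea, rewarding action sequences whose rollouts split environments into two groups along a hidden causal factor, can be illustrated concretely. Below is a minimal, hypothetical sketch assuming the reward is a two-way clustering separability score over trajectories; the toy `rollout` environment, the `mass` parameter, and the function names are illustrative assumptions, not the paper's implementation.

```python
# Illustrative sketch of a "causal curiosity"-style intrinsic reward.
# ASSUMPTION: the reward scores how cleanly one action sequence splits
# trajectories from different environments into two clusters (a binary
# quantization of a hidden causal factor such as block mass). All names
# and the toy dynamics below are hypothetical.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

def rollout(env_params, actions, rng):
    """Toy stand-in for an environment rollout: a 'heavy' block barely
    moves when lifted, while a 'light' one moves a lot (plus noise)."""
    mass = env_params["mass"]
    # Trajectory = block height after each action, scaled by 1/mass.
    return np.cumsum(actions) / mass + 0.01 * rng.standard_normal(len(actions))

def causal_curiosity_reward(actions, env_param_list, rng):
    """Intrinsic reward: two-way clustering separability of the
    trajectories this action sequence produces across environments."""
    trajs = np.stack([rollout(p, actions, rng) for p in env_param_list])
    labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(trajs)
    # Silhouette in [-1, 1]: higher means this "experiment" separates
    # the environments more cleanly by the hidden factor.
    return silhouette_score(trajs, labels), labels

rng = np.random.default_rng(0)
envs = [{"mass": m} for m in (1.0, 1.1, 5.0, 5.2)]  # two hidden regimes
lift = np.ones(10)    # an informative experiment: keep lifting
idle = np.zeros(10)   # an uninformative experiment: do nothing
for name, acts in [("lift", lift), ("idle", idle)]:
    reward, labels = causal_curiosity_reward(acts, envs, rng)
    print(f"{name}: reward={reward:.3f}, binary factor labels={labels}")
```

In this toy setting, the "lift" action sequence earns a higher reward than "idle" because lifting reveals the mass regime of each environment, and the cluster labels serve as the binary quantized representation of the causal factor described above.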