Efficient exploration is a long-standing problem in reinforcement learning, since extrinsic rewards are usually sparse or missing. A popular solution is to provide the agent with novelty signals as intrinsic rewards. In this work, we introduce SEMI, a self-supervised exploration policy that incentivizes the agent to maximize a new novelty signal: multisensory incongruity, which is measured in two aspects, perception incongruity and action incongruity. The former represents the misalignment of the multisensory inputs, while the latter represents the variance of the agent's policy under different sensory inputs. Specifically, an alignment predictor is learned to detect whether multiple sensory inputs are aligned, and its prediction error is used to measure perception incongruity. A policy model takes different combinations of the multisensory observations as input and outputs actions for exploration; the variance of these actions is used to measure action incongruity. Using both incongruities as intrinsic rewards, SEMI allows an agent to learn skills by exploring in a self-supervised manner without any external rewards. We further show that SEMI is compatible with extrinsic rewards and improves the sample efficiency of policy learning. The effectiveness of SEMI is demonstrated across a variety of benchmark environments, including object manipulation and audio-visual games.
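To make the two incongruity terms concrete, below is a minimal sketch of how the intrinsic reward could be computed, assuming a visual-audio setting, a learned `align_predictor` that scores whether two modalities come from the same moment, and a `policy` network that accepts both modalities. The function names, signatures, and the use of binary cross-entropy for the alignment error are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def intrinsic_reward(vision, audio, align_predictor, policy, is_aligned):
    """Sketch of a SEMI-style intrinsic reward: perception + action incongruity.

    align_predictor and policy are hypothetical modules assumed to take
    (vision, audio) batches and return a logit / an action vector, respectively.
    """
    # Perception incongruity: error of the alignment predictor on the current
    # multisensory observation pair (is_aligned = True if modalities match).
    align_logit = align_predictor(vision, audio)                  # (batch, 1)
    target = torch.full_like(align_logit, float(is_aligned))
    r_perception = F.binary_cross_entropy_with_logits(
        align_logit, target, reduction="none").squeeze(-1)        # (batch,)

    # Action incongruity: variance of the policy's actions when it is fed
    # different combinations of the sensory inputs (both, vision only, audio only).
    zeros_v, zeros_a = torch.zeros_like(vision), torch.zeros_like(audio)
    actions = torch.stack([
        policy(vision, audio),      # both modalities
        policy(vision, zeros_a),    # vision only
        policy(zeros_v, audio),     # audio only
    ], dim=0)                       # (3, batch, action_dim)
    r_action = actions.var(dim=0).mean(dim=-1)                    # (batch,)

    return r_perception + r_action
```

In this reading, both terms vanish for observations the agent already models well (aligned inputs are recognized and the policy is insensitive to dropping a modality), so the reward steers exploration toward states where the multisensory model is still uncertain.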