In vision-based reinforcement learning (RL) tasks, it is common to introduce auxiliary tasks with a surrogate self-supervised loss in order to obtain more semantic representations and improve sample efficiency. However, abundant information carried by these self-supervised auxiliary tasks is disregarded, because the representation-learning part and the decision-making part are kept separate. To make full use of the information in auxiliary tasks, we present a simple yet effective idea: employ the self-supervised loss as an intrinsic reward, a scheme we call Intrinsically Motivated Self-Supervised learning in Reinforcement learning (IM-SSR). We formally show that the self-supervised loss can be decomposed into a term that drives exploration of novel states and a term that improves robustness by eliminating nuisance factors. IM-SSR can be effortlessly plugged into any reinforcement learning algorithm with a self-supervised auxiliary objective at nearly no additional cost. Combined with IM-SSR, the underlying algorithms achieve salient improvements in both sample efficiency and generalization on various vision-based robotics tasks from the DeepMind Control Suite, especially when the reward signal is sparse.
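As a minimal sketch of the IM-SSR idea under stated assumptions, the snippet below reuses a per-sample contrastive (self-supervised) loss as an intrinsic bonus added to the environment reward. The encoder, the augmentation, and the coefficient `beta` are illustrative placeholders, not the authors' exact implementation.

```python
import torch
import torch.nn.functional as F

def augment(obs: torch.Tensor) -> torch.Tensor:
    # Placeholder random-shift augmentation for image batches of shape (N, C, H, W).
    pad = F.pad(obs, (4, 4, 4, 4), mode="replicate")
    h, w = obs.shape[-2:]
    top = torch.randint(0, 9, (1,)).item()
    left = torch.randint(0, 9, (1,)).item()
    return pad[..., top:top + h, left:left + w]

def intrinsic_reward_batch(encoder, obs_batch, beta=0.1):
    """Reuse a per-sample self-supervised (InfoNCE-style) loss as an intrinsic
    reward: a high SSL loss marks a novel / poorly represented observation."""
    with torch.no_grad():
        z1 = F.normalize(encoder(augment(obs_batch)), dim=-1)  # anchor view
        z2 = F.normalize(encoder(augment(obs_batch)), dim=-1)  # positive view
        logits = z1 @ z2.t()                                   # cosine-similarity logits
        labels = torch.arange(obs_batch.size(0), device=logits.device)
        # Per-sample loss (no reduction), so each transition gets its own bonus.
        ssl_loss = F.cross_entropy(logits, labels, reduction="none")
    return beta * ssl_loss

# Usage sketch: r_total = r_env + intrinsic_reward_batch(encoder, obs_batch)
```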