Generalization across different environments with the same underlying task is critical for deploying visual reinforcement learning (RL) in real-world scenarios. However, visual distractions, which are common in real scenes, can corrupt the representations learned from high-dimensional observations and thus degrade generalization performance. To tackle this problem, we propose a novel approach, Characteristic Reward Sequence Prediction (CRESP), which extracts task-relevant information by learning reward sequence distributions (RSDs), since reward signals in RL are task-relevant and invariant to visual distractions. Specifically, to capture task-relevant information via RSDs effectively, CRESP introduces an auxiliary task, predicting the characteristic functions of RSDs, to learn task-relevant representations, because these high-dimensional distributions can be well approximated through their characteristic functions. Experiments demonstrate that CRESP significantly improves generalization to unseen environments, outperforming several state-of-the-art methods on DeepMind Control tasks with different visual distractions.
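To make the characteristic-function idea concrete, the following is a minimal sketch (not the authors' implementation) of the empirical characteristic function of a reward-sequence distribution. It evaluates phi(u) = E[exp(i * u . R)] over sampled reward sequences R; the function name and toy data are illustrative assumptions.

```python
import numpy as np

def empirical_char_fn(reward_seqs, u):
    """Empirical characteristic function of a reward-sequence distribution.

    reward_seqs: (N, T) array, N sampled reward sequences of length T.
    u: (T,) frequency vector at which to evaluate phi(u) = E[exp(i * u . R)].
    Returns the (real, imag) parts of the empirical estimate.
    """
    phase = reward_seqs @ u            # (N,) inner products u . R
    return np.cos(phase).mean(), np.sin(phase).mean()

# Toy check: for a deterministic reward sequence, |phi(u)| = 1 for all u.
R = np.tile(np.array([1.0, 0.0, 1.0]), (5, 1))   # 5 identical sequences
u = np.array([0.3, -0.2, 0.5])
re, im = empirical_char_fn(R, u)
print(round(re**2 + im**2, 6))  # -> 1.0
```

In an auxiliary-task setting along the lines described above, a representation network would be trained to regress such (real, imag) targets at sampled frequency vectors u, so that features predictive of the RSD, and hence task-relevant, are retained.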