Deep reinforcement learning policies, despite their outstanding efficiency in simulated visual control tasks, have shown a disappointing ability to generalize across disturbances in the input training images. Changes in image statistics or distracting background elements are pitfalls that prevent generalization and real-world applicability of such control policies. We elaborate on the intuition that a good visual policy should be able to identify which pixels are important for its decision, and preserve this identification of important sources of information across images. This implies that training a policy with a small generalization gap should focus on these important pixels and ignore the others. This leads to the introduction of saliency-guided Q-networks (SGQN), a generic method for visual reinforcement learning that is compatible with any value-function learning method. SGQN vastly improves the generalization capability of Soft Actor-Critic agents and outperforms existing state-of-the-art methods on the DeepMind Control Generalization Benchmark, setting a new reference in terms of training efficiency, generalization gap, and policy interpretability.
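To make the intuition concrete, below is a minimal, hypothetical PyTorch sketch of a saliency-guided auxiliary loss for a Q-network: saliency is approximated by the magnitude of the Q-value's gradient with respect to the input pixels, the least salient pixels are masked out, and the critic is regularized to produce the same value on the masked observation. The network architecture, the input-gradient attribution, and the names `ConvQNetwork`, `saliency_mask`, `saliency_guided_loss`, `keep_ratio`, and `lam` are illustrative assumptions, not the exact SGQN algorithm described in the paper.

```python
# Illustrative sketch of a saliency-guided critic regularizer (not the paper's exact method).
import torch
import torch.nn as nn
import torch.nn.functional as F


class ConvQNetwork(nn.Module):
    """Toy convolutional critic Q(s, a) for image observations (illustrative)."""

    def __init__(self, in_channels: int):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(in_channels, 32, 3, stride=2), nn.ReLU(),
            nn.Conv2d(32, 32, 3, stride=2), nn.ReLU(),
            nn.Flatten(),
        )
        # LazyLinear infers its input size (encoded pixels + action dims) on first call.
        self.head = nn.LazyLinear(1)

    def forward(self, obs: torch.Tensor, act: torch.Tensor) -> torch.Tensor:
        return self.head(torch.cat([self.encoder(obs), act], dim=-1))


def saliency_mask(q_net, obs, act, keep_ratio=0.1):
    """Binary mask of the most Q-relevant pixels, from input-gradient saliency."""
    obs = obs.detach().clone().requires_grad_(True)
    q = q_net(obs, act).sum()
    grad, = torch.autograd.grad(q, obs)
    sal = grad.abs().amax(dim=1, keepdim=True)          # (B, 1, H, W) saliency map
    flat = sal.flatten(1)
    thresh = flat.quantile(1.0 - keep_ratio, dim=1, keepdim=True)
    return (flat >= thresh).float().view_as(sal)        # keep only the top pixels


def saliency_guided_loss(q_net, obs, act, td_loss, lam=0.5):
    """Usual TD loss plus a term keeping Q stable when non-salient pixels are removed."""
    with torch.no_grad():
        q_full = q_net(obs, act)                        # value on the full image
    mask = saliency_mask(q_net, obs, act)
    q_masked = q_net(obs * mask, act)                   # only salient pixels remain
    return td_loss + lam * F.mse_loss(q_masked, q_full)


if __name__ == "__main__":
    q_net = ConvQNetwork(in_channels=9)                 # e.g. 3 stacked RGB frames
    obs = torch.rand(4, 9, 84, 84)
    act = torch.rand(4, 6)
    td_loss = torch.tensor(0.0)                         # placeholder for the usual critic loss
    print(saliency_guided_loss(q_net, obs, act, td_loss))
```

In this sketch the consistency term penalizes the critic whenever discarding the pixels it deems unimportant changes its value estimate, which is one way to encode the abstract's requirement that the policy focus on important pixels and ignore the others.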