Frustratingly Easy Regularization on Representation Can Boost Deep Reinforcement Learning - 专知论文

会员服务 ·

0

正则化项 · Learning · 表示 · Performer · Boosting（一种模型训练加速方式） ·

2023 年 4 月 23 日

Frustratingly Easy Regularization on Representation Can Boost Deep Reinforcement Learning

翻译：暂无翻译

Qiang He,Huangyuan Su,Jieyu Zhang,Xinwen Hou

from arxiv, Accepted to CVPR23. Website: https://sites.google.com/view/peer-cvpr2023/

Deep reinforcement learning (DRL) gives the promise that an agent learns good policy from high-dimensional information, whereas representation learning removes irrelevant and redundant information and retains pertinent information. In this work, we demonstrate that the learned representation of the $Q$-network and its target $Q$-network should, in theory, satisfy a favorable distinguishable representation property. Specifically, there exists an upper bound on the representation similarity of the value functions of two adjacent time steps in a typical DRL setting. However, through illustrative experiments, we show that the learned DRL agent may violate this property and lead to a sub-optimal policy. Therefore, we propose a simple yet effective regularizer called Policy Evaluation with Easy Regularization on Representation (PEER), which aims to maintain the distinguishable representation property via explicit regularization on internal representations. And we provide the convergence rate guarantee of PEER. Implementing PEER requires only one line of code. Our experiments demonstrate that incorporating PEER into DRL can significantly improve performance and sample efficiency. Comprehensive experiments show that PEER achieves state-of-the-art performance on all 4 environments on PyBullet, 9 out of 12 tasks on DMControl, and 19 out of 26 games on Atari. To the best of our knowledge, PEER is the first work to study the inherent representation property of Q-network and its target. Our code is available at https://sites.google.com/view/peer-cvpr2023/.

翻译：暂无翻译

0

相关内容

正则化项

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

165+阅读 · 2020年3月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

开源书：PyTorch深度学习起步

开源书：PyTorch深度学习起步

专知会员服务

51+阅读 · 2019年10月11日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

181+阅读 · 2019年10月11日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

图与推荐

2+阅读 · 2022年11月2日

局部学习的特征选择：Local-Learning-Based Feature Selection

局部学习的特征选择：Local-Learning-Based Feature Selection

我爱读PAMI

14+阅读 · 2019年9月20日

强化学习三篇论文避免遗忘等

强化学习三篇论文避免遗忘等

CreateAMind

20+阅读 · 2019年5月24日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

深度自进化聚类：Deep Self-Evolution Clustering

深度自进化聚类：Deep Self-Evolution Clustering

我爱读PAMI

15+阅读 · 2019年4月13日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

LibRec 精选：推荐系统的论文与源码

LibRec 精选：推荐系统的论文与源码

LibRec智能推荐

14+阅读 · 2018年11月29日

宽谱敏化的Yb3+掺杂硅(氧)氮化物制备与近红外发光性能

国家自然科学基金

0+阅读 · 2014年12月31日

光束在PT对称光子晶格中的线性动力学特性

国家自然科学基金

0+阅读 · 2013年12月31日

基于SURE/PURE准则的图像盲反卷积算法研究

国家自然科学基金

3+阅读 · 2013年12月31日

间断Galerkin方法及在电磁计算中的应用研究

国家自然科学基金

0+阅读 · 2013年12月31日

掺杂超冷原子气体的低能激发

国家自然科学基金

0+阅读 · 2012年12月31日

基于李雅普诺夫稳定性理论的含超导量子比特开放量子系统最优控制的研究

国家自然科学基金

0+阅读 · 2012年12月31日

三维流形上的Heegaard分解及其在纽结理论中应用

国家自然科学基金

0+阅读 · 2011年12月31日

量子比特的声子效应

国家自然科学基金

0+阅读 · 2009年12月31日

基于Al结的超导量子比特的制备和宏观量子性质研究

国家自然科学基金

0+阅读 · 2009年12月31日

超导磁通型量子比特的耦合及绝热量子计算的研究

国家自然科学基金

0+阅读 · 2008年12月31日

Reinforcement Learning Based Power Control for Reliable Mission-Critical Wireless Transmission

Arxiv

0+阅读 · 2023年6月8日

Generalization Across Observation Shifts in Reinforcement Learning

Arxiv

0+阅读 · 2023年6月7日

Learning to Do or Learning While Doing: Reinforcement Learning and Bayesian Optimisation for Online Continuous Tuning

Arxiv

0+阅读 · 2023年6月6日

Enabling Efficient Interaction between an Algorithm Agent and an LLM: A Reinforcement Learning Approach

Arxiv

0+阅读 · 2023年6月6日

A Survey on Causal Reinforcement Learning

Arxiv

29+阅读 · 2023年2月10日

A Survey on Transformers in Reinforcement Learning

Arxiv

31+阅读 · 2023年1月8日

Reinforcement Learning on Graph: A Survey

Arxiv

67+阅读 · 2022年4月13日

Deep Learning for Learning Graph Representations

Arxiv

35+阅读 · 2020年1月2日

A Multi-Objective Deep Reinforcement Learning Framework

A Multi-Objective Deep Reinforcement Learning Framework

Arxiv

16+阅读 · 2018年6月27日

Deep Reinforcement Learning for List-wise Recommendations

Arxiv

13+阅读 · 2018年1月5日

VIP会员

文章信息

相关主题

Boosting（一种模型训练加速方式）

相关VIP内容

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

165+阅读 · 2020年3月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

开源书：PyTorch深度学习起步

开源书：PyTorch深度学习起步

专知会员服务

51+阅读 · 2019年10月11日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

181+阅读 · 2019年10月11日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

《精确反蜂群防御系统：三维运动探测与定向空爆拦截技术融合》最新24页

地下战：地下空间的战略博弈

《无人机战争时代的战时法：大国竞争中的区分原则、相称性原则与行动建议》最新75页

《构建强健军事力量的设计挑战：提升海军兵力支持系统效能的多分辨率建模方法》69页

相关资讯

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

图与推荐

2+阅读 · 2022年11月2日

局部学习的特征选择：Local-Learning-Based Feature Selection

局部学习的特征选择：Local-Learning-Based Feature Selection

我爱读PAMI

14+阅读 · 2019年9月20日

强化学习三篇论文避免遗忘等

强化学习三篇论文避免遗忘等

CreateAMind

20+阅读 · 2019年5月24日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

深度自进化聚类：Deep Self-Evolution Clustering

深度自进化聚类：Deep Self-Evolution Clustering

我爱读PAMI

15+阅读 · 2019年4月13日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

LibRec 精选：推荐系统的论文与源码

LibRec 精选：推荐系统的论文与源码

LibRec智能推荐

14+阅读 · 2018年11月29日

相关论文

Reinforcement Learning Based Power Control for Reliable Mission-Critical Wireless Transmission

Arxiv

0+阅读 · 2023年6月8日

Generalization Across Observation Shifts in Reinforcement Learning

Arxiv

0+阅读 · 2023年6月7日

Learning to Do or Learning While Doing: Reinforcement Learning and Bayesian Optimisation for Online Continuous Tuning

Arxiv

0+阅读 · 2023年6月6日

Enabling Efficient Interaction between an Algorithm Agent and an LLM: A Reinforcement Learning Approach

Arxiv

0+阅读 · 2023年6月6日

A Survey on Causal Reinforcement Learning

Arxiv

29+阅读 · 2023年2月10日

A Survey on Transformers in Reinforcement Learning

Arxiv

31+阅读 · 2023年1月8日

Reinforcement Learning on Graph: A Survey

Arxiv

67+阅读 · 2022年4月13日

Deep Learning for Learning Graph Representations

Arxiv

35+阅读 · 2020年1月2日

A Multi-Objective Deep Reinforcement Learning Framework

A Multi-Objective Deep Reinforcement Learning Framework

Arxiv

16+阅读 · 2018年6月27日

Deep Reinforcement Learning for List-wise Recommendations

Arxiv

13+阅读 · 2018年1月5日

相关基金

宽谱敏化的Yb3+掺杂硅(氧)氮化物制备与近红外发光性能

国家自然科学基金

0+阅读 · 2014年12月31日

光束在PT对称光子晶格中的线性动力学特性

国家自然科学基金

0+阅读 · 2013年12月31日

基于SURE/PURE准则的图像盲反卷积算法研究

国家自然科学基金

3+阅读 · 2013年12月31日

间断Galerkin方法及在电磁计算中的应用研究

国家自然科学基金

0+阅读 · 2013年12月31日

掺杂超冷原子气体的低能激发

国家自然科学基金

0+阅读 · 2012年12月31日

基于李雅普诺夫稳定性理论的含超导量子比特开放量子系统最优控制的研究

国家自然科学基金

0+阅读 · 2012年12月31日

三维流形上的Heegaard分解及其在纽结理论中应用

国家自然科学基金

0+阅读 · 2011年12月31日

量子比特的声子效应

国家自然科学基金

0+阅读 · 2009年12月31日

基于Al结的超导量子比特的制备和宏观量子性质研究

国家自然科学基金

0+阅读 · 2009年12月31日

超导磁通型量子比特的耦合及绝热量子计算的研究

国家自然科学基金

0+阅读 · 2008年12月31日

微信扫码咨询专知VIP会员