This work identifies a common flaw of deep reinforcement learning (RL) algorithms: a tendency to rely on early interactions and ignore useful evidence encountered later. Because of training on progressively growing datasets, deep RL agents incur a risk of overfitting to earlier experiences, negatively affecting the rest of the learning process. Inspired by cognitive science, we refer to this effect as the primacy bias. Through a series of experiments, we dissect the algorithmic aspects of deep RL that exacerbate this bias. We then propose a simple yet generally-applicable mechanism that tackles the primacy bias by periodically resetting a part of the agent. We apply this mechanism to algorithms in both discrete (Atari 100k) and continuous action (DeepMind Control Suite) domains, consistently improving their performance.
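The reset mechanism described above amounts to re-initializing part of the agent's network at fixed intervals while keeping the rest of the agent (including its replay buffer) intact. Below is a minimal PyTorch sketch of that idea under illustrative assumptions: the network sizes, the reset interval, and the choice to reset only the final layers are placeholders, not the paper's exact configuration.

```python
# Minimal sketch of periodic resets to counter the primacy bias.
# All hyperparameters and the architecture are illustrative assumptions.
import torch
import torch.nn as nn


class QNetwork(nn.Module):
    def __init__(self, obs_dim: int, n_actions: int):
        super().__init__()
        # Early layers are kept across resets in this sketch.
        self.encoder = nn.Sequential(nn.Linear(obs_dim, 256), nn.ReLU())
        # Final layers ("a part of the agent") are periodically re-initialized.
        self.head = nn.Sequential(
            nn.Linear(256, 256), nn.ReLU(), nn.Linear(256, n_actions)
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.head(self.encoder(obs))

    def reset_head(self) -> None:
        # Re-initialize only the head, leaving the encoder untouched.
        for layer in self.head:
            if isinstance(layer, nn.Linear):
                layer.reset_parameters()


def train(num_updates: int = 100_000, reset_interval: int = 20_000) -> None:
    q = QNetwork(obs_dim=8, n_actions=4)
    optimizer = torch.optim.Adam(q.parameters(), lr=3e-4)
    for step in range(1, num_updates + 1):
        # ... sample a batch from the replay buffer and take a gradient step ...
        if step % reset_interval == 0:
            # Periodic reset: the agent relearns from the full replay buffer,
            # rather than staying anchored to its earliest experiences.
            q.reset_head()
            optimizer = torch.optim.Adam(q.parameters(), lr=3e-4)
```

Because the replay buffer is preserved, the re-initialized layers can quickly recover by training on all data collected so far, which is why resets can discard early overfitting without discarding the evidence gathered later.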