Beyond Tabula Rasa: Reincarnating Reinforcement Learning - Zhuanzhi (专知) Papers


Learning · Agent · Ad hoc · Reinforcement learning · Knowledge

June 3, 2022

Beyond Tabula Rasa: Reincarnating Reinforcement Learning


Rishabh Agarwal, Max Schwarzer, Pablo Samuel Castro, Aaron Courville, Marc G. Bellemare

Learning tabula rasa, that is without any prior knowledge, is the prevalent workflow in reinforcement learning (RL) research. However, RL systems, when applied to large-scale settings, rarely operate tabula rasa. Such large-scale systems undergo multiple design or algorithmic changes during their development cycle and use ad hoc approaches for incorporating these changes without re-training from scratch, which would have been prohibitively expensive. Additionally, the inefficiency of deep RL typically excludes researchers without access to industrial-scale resources from tackling computationally-demanding problems. To address these issues, we present reincarnating RL as an alternative workflow, where prior computational work (e.g., learned policies) is reused or transferred between design iterations of an RL agent, or from one RL agent to another. As a step towards enabling reincarnating RL from any agent to any other agent, we focus on the specific setting of efficiently transferring an existing sub-optimal policy to a standalone value-based RL agent. We find that existing approaches fail in this setting and propose a simple algorithm to address their limitations. Equipped with this algorithm, we demonstrate reincarnating RL's gains over tabula rasa RL on Atari 2600 games, a challenging locomotion task, and the real-world problem of navigating stratospheric balloons. Overall, this work argues for an alternative approach to RL research, which we believe could significantly improve real-world RL adoption and help democratize it further.

