We study the use of model-based reinforcement learning, in particular world models, for continual reinforcement learning. In continual reinforcement learning, an agent must solve a sequence of tasks, one after another, while retaining performance on and preventing forgetting of past tasks. World models offer a task-agnostic solution: they do not require knowledge of task changes. World models are a straightforward baseline for continual reinforcement learning for three main reasons. First, forgetting in the world model is prevented by persisting the experience replay buffer across tasks, so that experience from previous tasks is replayed when learning the world model. Second, they are sample efficient. Third, they offer a task-agnostic exploration strategy through the uncertainty in the trajectories generated by the world model. We show that world models are a simple and effective continual reinforcement learning baseline. We study their effectiveness on the Minigrid and Minihack continual reinforcement learning benchmarks and show that they outperform state-of-the-art task-agnostic continual reinforcement learning methods.