CORA: Benchmarks, Baselines, and Metrics as a Platform for Continual Reinforcement Learning Agents

Progress in continual reinforcement learning has been limited due to several barriers to entry: missing code, high compute requirements, and a lack of suitable benchmarks. In this work, we present CORA, a platform for Continual Reinforcement Learning Agents that provides benchmarks, baselines, and metrics in a single code package. The benchmarks we provide are designed to evaluate different aspects of the continual RL challenge, such as catastrophic forgetting, plasticity, ability to generalize, and sample-efficient learning. Three of the benchmarks utilize video game environments (Atari, Procgen, NetHack). The fourth benchmark, CHORES, consists of four different task sequences in a visually realistic home simulator, drawn from a diverse set of task and scene parameters. To compare continual RL methods on these benchmarks, we prepare three metrics in CORA: continual evaluation, forgetting, and zero-shot forward transfer. Finally, CORA includes a set of performant, open-source baselines of existing algorithms for researchers to use and expand on. We release CORA and hope that the continual RL community can benefit from our contributions, to accelerate the development of new continual RL algorithms.
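As a rough illustration of what the forgetting and zero-shot forward transfer metrics are meant to capture, the sketch below computes simplified versions from a table of evaluation returns. The abstract does not give the formulas, so everything here is an assumption: the `perf` table layout, the function names, and the use of raw (unnormalized) return differences are hypothetical and are not taken from the CORA code package.

```python
from typing import Sequence

# Assumed layout (hypothetical, not CORA's API):
# perf[k][j] is the mean evaluation return on task j at checkpoint k.
# Checkpoint 0 is before any training; checkpoint k (k >= 1) is taken
# right after training on the k-th task in the sequence, i.e. task k - 1.
PerfTable = Sequence[Sequence[float]]


def forgetting(perf: PerfTable, task: int, checkpoint: int) -> float:
    """Drop in return on `task` between the end of its own training
    stage and a later checkpoint. Positive values mean forgetting."""
    learned_at = task + 1  # checkpoint taken right after training `task`
    if checkpoint <= learned_at:
        raise ValueError("checkpoint must come after the task was trained")
    return perf[learned_at][task] - perf[checkpoint][task]


def zero_shot_forward_transfer(perf: PerfTable, task: int, checkpoint: int) -> float:
    """Change in return on a not-yet-trained `task` caused by the training
    stage that ends at `checkpoint`. Positive values mean the earlier
    task helped the unseen one."""
    if checkpoint < 1 or checkpoint > task:
        raise ValueError("task must still be unseen at this checkpoint")
    return perf[checkpoint][task] - perf[checkpoint - 1][task]


if __name__ == "__main__":
    # Made-up returns for a three-task sequence, for illustration only.
    perf = [
        [0.0, 0.0, 0.0],   # before training
        [10.0, 2.0, 0.0],  # after task 0
        [6.0, 9.0, 1.0],   # after task 1
        [5.0, 7.0, 8.0],   # after task 2
    ]
    print(forgetting(perf, task=0, checkpoint=2))
    print(zero_shot_forward_transfer(perf, task=1, checkpoint=1))
```

Continual evaluation, in this picture, is simply the practice of filling in the whole table by evaluating every task at every checkpoint. A real implementation would likely normalize the differences so that tasks with different return scales are comparable.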

