Progress in continual reinforcement learning has been limited by several barriers to entry: missing code, high compute requirements, and a lack of suitable benchmarks. In this work, we present CORA, a platform for Continual Reinforcement Learning Agents that provides benchmarks, baselines, and metrics in a single code package. The benchmarks are designed to evaluate different aspects of the continual RL challenge, such as catastrophic forgetting, plasticity, generalization, and sample-efficient learning. Three of the benchmarks use video game environments (Atari, Procgen, NetHack). The fourth benchmark, CHORES, consists of four task sequences in a visually realistic home simulator, drawn from a diverse set of task and scene parameters. To compare continual RL methods on these benchmarks, CORA provides three metrics: Continual Evaluation, Isolated Forgetting, and Zero-Shot Forward Transfer. Finally, CORA includes a set of performant, open-source baselines of existing algorithms for researchers to use and expand on. We release CORA in the hope that the continual RL community can benefit from these contributions and accelerate the development of new continual RL algorithms.
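To make the three metric names concrete, the sketch below computes analogous quantities as they are commonly defined in the continual-learning literature, starting from a task-by-checkpoint evaluation matrix. The formulas and the `continual_metrics` helper are illustrative assumptions for this reading, not CORA's actual definitions or API.

```python
# Illustrative sketch only: generic continual-learning metrics computed
# from an evaluation matrix. These are plausible readings of the three
# metric names, NOT CORA's actual definitions; the helper is hypothetical.
import numpy as np

def continual_metrics(R: np.ndarray):
    """R[i, j] = return on task j, evaluated after training on task i.

    Rows follow the training sequence; R has shape (num_tasks, num_tasks).
    """
    n = R.shape[0]
    # Average performance across all tasks at the end of the sequence
    # (one plausible reading of "Continual Evaluation").
    continual_eval = R[-1].mean()
    # Per-task drop from its peak score to its final score
    # (one plausible reading of "Isolated Forgetting").
    forgetting = np.array([R[:, j].max() - R[-1, j] for j in range(n - 1)])
    # Score on each task *before* it is trained on
    # (one plausible reading of "Zero-Shot Forward Transfer").
    forward_transfer = np.array([R[j - 1, j] for j in range(1, n)])
    return continual_eval, forgetting.mean(), forward_transfer.mean()

# Example with 3 tasks; row i holds evaluations after training task i.
R = np.array([[0.9, 0.1, 0.0],
              [0.6, 0.8, 0.2],
              [0.5, 0.7, 0.9]])
print(continual_metrics(R))  # (0.7, 0.25, 0.15)
```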