Existing literature in Continual Learning (CL) has focused on overcoming catastrophic forgetting, the inability of the learner to recall how to perform tasks observed in the past. There are, however, other desirable properties of a CL system, such as the ability to transfer knowledge from previous tasks and to scale memory and compute sub-linearly with the number of tasks. Since most current benchmarks focus only on forgetting, using short streams of tasks, we first propose a new suite of benchmarks to probe CL algorithms across these new axes. Finally, we introduce a new modular architecture, whose modules represent atomic skills that can be composed to perform a certain task. Learning a task reduces to figuring out which past modules to re-use and which new modules to instantiate to solve the current task. Our learning algorithm leverages a task-driven prior over the exponential search space of all possible ways to combine modules, enabling efficient learning on long streams of tasks. Our experiments show that this modular architecture and learning algorithm perform competitively on widely used CL benchmarks while yielding superior performance on the more challenging benchmarks we introduce in this work.
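To make the module re-use idea concrete, here is a minimal sketch (not the authors' implementation) of a modular network in which each layer holds a growing library of modules and each task is assigned one module per layer, either re-used from a previous task or newly instantiated. The class and method names (ModularNet, add_task), layer sizes, and module structure are illustrative assumptions; the task-driven prior over module combinations is not modeled here.

```python
# Illustrative sketch of a modular network with per-task module selection.
# Assumed names and sizes; not the paper's actual architecture or training code.
import torch
import torch.nn as nn


class ModularNet(nn.Module):
    def __init__(self, layer_sizes):
        super().__init__()
        self.layer_sizes = layer_sizes
        # One library of candidate modules per layer; libraries grow as tasks arrive.
        self.layers = nn.ModuleList(nn.ModuleList() for _ in range(len(layer_sizes) - 1))
        self.paths = {}  # task_id -> list of module indices, one per layer

    def add_task(self, task_id, reuse=None):
        """Create a path for a new task: re-use the given module index per layer,
        or instantiate a fresh module where reuse[i] is None."""
        reuse = reuse or [None] * len(self.layers)
        path = []
        for i, layer in enumerate(self.layers):
            if reuse[i] is not None:
                idx = reuse[i]  # re-use a previously learned module
            else:
                layer.append(nn.Sequential(
                    nn.Linear(self.layer_sizes[i], self.layer_sizes[i + 1]),
                    nn.ReLU(),
                ))
                idx = len(layer) - 1  # newly instantiated module for this task
            path.append(idx)
        self.paths[task_id] = path
        return path

    def forward(self, x, task_id):
        # Compose the task's selected modules, one per layer.
        for layer, idx in zip(self.layers, self.paths[task_id]):
            x = layer[idx](x)
        return x


# Usage: task 0 instantiates all-new modules; task 1 re-uses the first two
# layers' modules from task 0 and adds a fresh module only at the last layer.
net = ModularNet([784, 128, 128, 10])
net.add_task(0)
net.add_task(1, reuse=[0, 0, None])
out = net(torch.randn(32, 784), task_id=1)
print(out.shape)  # torch.Size([32, 10])
```

In this sketch, memory grows only when a new module is instantiated, which is how a modular learner can in principle scale sub-linearly with the number of tasks when many modules are shared; deciding which modules to re-use is the search problem the abstract's task-driven prior is meant to make tractable.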