Continual learning requires incremental compatibility with a sequence of tasks. However, the design of model architecture remains an open question: in general, learning all tasks with a shared set of parameters suffers from severe interference between tasks, while learning each task with a dedicated parameter subspace is limited by scalability. In this work, we theoretically analyze the generalization errors for learning plasticity and memory stability in continual learning, which can be uniformly upper-bounded by (1) the discrepancy between task distributions, (2) the flatness of the loss landscape, and (3) the cover of the parameter space. Then, inspired by the robust biological learning system that processes sequential experiences with multiple parallel compartments, we propose Cooperation of Small Continual Learners (CoSCL) as a general strategy for continual learning. Specifically, we present an architecture with a fixed number of narrower sub-networks to learn all incremental tasks in parallel, which naturally reduces both errors by improving the three components of the upper bound. To strengthen this advantage, we encourage these sub-networks to cooperate by penalizing differences between the predictions made from their feature representations. With a fixed parameter budget, CoSCL improves a variety of representative continual learning approaches by a large margin (e.g., up to 10.64% on CIFAR-100-SC, 9.33% on CIFAR-100-RS, 11.45% on CUB-200-2011, and 6.72% on Tiny-ImageNet) and achieves new state-of-the-art performance.
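To make the described architecture concrete, below is a minimal sketch of an ensemble of narrow sub-networks with a cooperation penalty on their predictions. The module sizes, the number of learners, and the KL-to-ensemble-mean form of the penalty are illustrative assumptions, not the paper's exact formulation; the abstract only specifies that prediction differences among sub-networks are penalized.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SmallLearner(nn.Module):
    """One narrow sub-network: a small feature extractor plus a classification head."""

    def __init__(self, in_dim, feat_dim, num_classes):
        super().__init__()
        self.features = nn.Sequential(
            nn.Linear(in_dim, feat_dim), nn.ReLU(),
            nn.Linear(feat_dim, feat_dim), nn.ReLU(),
        )
        self.head = nn.Linear(feat_dim, num_classes)

    def forward(self, x):
        return self.head(self.features(x))


class CoSCLSketch(nn.Module):
    """A fixed number of narrow learners trained in parallel on all incremental
    tasks; their logits are averaged to form the ensemble prediction."""

    def __init__(self, in_dim, feat_dim, num_classes, num_learners=5):
        super().__init__()
        self.learners = nn.ModuleList(
            SmallLearner(in_dim, feat_dim, num_classes) for _ in range(num_learners)
        )

    def forward(self, x):
        per_learner_logits = [learner(x) for learner in self.learners]
        ensemble_logits = torch.stack(per_learner_logits).mean(dim=0)
        return ensemble_logits, per_learner_logits


def cooperation_loss(logits_list):
    """Penalize disagreement among sub-learner predictions
    (KL divergence to the ensemble-mean distribution; an assumed form)."""
    probs = [F.softmax(z, dim=-1) for z in logits_list]
    mean_p = torch.stack(probs).mean(dim=0)
    loss = 0.0
    for p in probs:
        loss = loss + F.kl_div(mean_p.log(), p, reduction="batchmean")
    return loss / len(probs)


def training_step(model, x, y, lam=0.1):
    """Hypothetical training step: task loss on the ensemble output plus a
    weighted cooperation penalty encouraging the learners to agree."""
    ensemble_logits, per_learner_logits = model(x)
    task_loss = F.cross_entropy(ensemble_logits, y)
    return task_loss + lam * cooperation_loss(per_learner_logits)
```

This sketch can be combined with any parameter-regularization or replay-based continual learning method by applying that method to each sub-network, which is how the fixed parameter budget is shared across the ensemble.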