The continual learning (CL) paradigm aims to enable neural networks to learn tasks sequentially. The fundamental challenge in this paradigm is catastrophic forgetting: performance on previously learned tasks degrades when the model is optimized for a new task, especially when the data of the old tasks is no longer accessible. Current architectural-based methods alleviate catastrophic forgetting, but at the expense of expanding the capacity of the model. Regularization-based methods maintain a fixed model capacity; however, previous studies have shown that these methods suffer a large performance degradation when the task identity is not available during inference (e.g., in the class incremental learning scenario). In this work, we propose a novel architectural-based method, referred to as SpaceNet, for the class incremental learning scenario, in which the available fixed capacity of the model is utilized intelligently. SpaceNet trains sparse deep neural networks from scratch in an adaptive way that compresses the sparse connections of each task into a compact number of neurons. This adaptive training of the sparse connections yields sparse representations that reduce interference between tasks. Experimental results show the robustness of the proposed method against catastrophic forgetting of old tasks and its efficiency in utilizing the available capacity of the model, leaving space for more tasks to be learned. In particular, when SpaceNet is evaluated on the well-known CL benchmarks split MNIST, split Fashion-MNIST, and CIFAR-10/100, it outperforms regularization-based methods by a large margin. Moreover, it achieves better performance than architectural-based methods without model expansion and comparable results to rehearsal-based methods, while offering a substantial memory reduction.
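To make the core idea concrete, the following is a minimal illustrative sketch, not the authors' exact algorithm: each task is assigned a compact group of neurons within a fixed-capacity layer, its connections are kept sparse and restricted to that group, and gradient updates only flow through the current task's mask, so earlier tasks' connections are never overwritten. At inference no task identity is used, so all learned sparse connections are active together. The helpers `allocate_task`, `train_step`, and `forward`, and all hyperparameters shown, are hypothetical; SpaceNet itself additionally redistributes connections adaptively during training, which is omitted here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Fixed-capacity layer: 100 inputs -> 80 hidden neurons shared by all tasks.
n_in, n_hidden = 100, 80
W = np.zeros((n_in, n_hidden))        # shared weight matrix (fixed size)
task_masks = {}                        # task_id -> binary connection mask
free_neurons = set(range(n_hidden))    # hidden units not yet owned by any task

def allocate_task(task_id, neurons_per_task=10, sparsity=0.9):
    """Reserve a compact group of free neurons for a new task and create a
    sparse connection mask restricted to those neurons (illustrative only)."""
    chosen = rng.choice(sorted(free_neurons), size=neurons_per_task, replace=False)
    free_neurons.difference_update(chosen.tolist())
    mask = np.zeros_like(W, dtype=bool)
    # Keep only a small random fraction of connections into the chosen neurons.
    mask[:, chosen] = rng.random((n_in, neurons_per_task)) > sparsity
    task_masks[task_id] = mask

def train_step(task_id, grad, lr=0.1):
    """Apply a (mock) gradient only through the current task's sparse mask,
    so connections learned for previous tasks stay untouched."""
    global W
    W -= lr * grad * task_masks[task_id]

def forward(x):
    """Class-incremental inference: no task id is given, so all learned
    sparse connections are active at once."""
    combined = np.zeros_like(W, dtype=bool)
    for m in task_masks.values():
        combined |= m
    return x @ (W * combined)

# Learn two tasks sequentially on the same fixed-capacity layer.
for t in range(2):
    allocate_task(t)
    for _ in range(5):
        fake_grad = rng.normal(size=W.shape)   # stand-in for a real gradient
        train_step(t, fake_grad)

print("non-zero weights per task:",
      {t: int((W * m != 0).sum()) for t, m in task_masks.items()})
```

Because each task's mask occupies a disjoint, compact set of neurons and only a small fraction of incoming connections, the fixed capacity is consumed gradually, which is what leaves space for future tasks in the sketch above.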