Artificial neural networks thrive in solving the classification problem for a particular rigid task, acquiring knowledge through generalized learning behaviour from a distinct training phase. The resulting network resembles a static entity of knowledge; endeavours to extend this knowledge without targeting the original task result in catastrophic forgetting. Continual learning shifts this paradigm towards networks that can continually accumulate knowledge over different tasks without the need to retrain from scratch. We focus on task-incremental classification, where tasks arrive sequentially and are delineated by clear boundaries. Our main contributions concern 1) a taxonomy and extensive overview of the state-of-the-art, 2) a novel framework to continually determine the stability-plasticity trade-off of the continual learner, and 3) a comprehensive experimental comparison of 11 state-of-the-art continual learning methods and 4 baselines. We empirically scrutinize method strengths and weaknesses on three benchmarks, considering Tiny Imagenet, the large-scale unbalanced iNaturalist, and a sequence of recognition datasets. We study the influence of model capacity, weight decay and dropout regularization, and the order in which the tasks are presented, and qualitatively compare methods in terms of required memory, computation time, and storage.
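To make the task-incremental setting concrete, the following is a minimal sketch, not the paper's implementation: tasks arrive one after another with known boundaries, each task gets its own output head on a shared backbone, and earlier tasks' data is no longer available during later training. All names (MultiHeadClassifier, train_task, the input size) are illustrative assumptions in a PyTorch-style setup; actual continual-learning methods would extend train_task with, e.g., a regularization penalty or rehearsal to balance stability and plasticity.

```python
# Illustrative sketch of task-incremental classification (assumed PyTorch setup).
import torch
import torch.nn as nn


class MultiHeadClassifier(nn.Module):
    """Shared feature extractor with one classification head per task (hypothetical architecture)."""

    def __init__(self, feature_dim: int, classes_per_task: list):
        super().__init__()
        # Toy backbone for flattened 32x32x3 inputs; any feature extractor works.
        self.backbone = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32 * 3, feature_dim), nn.ReLU())
        self.heads = nn.ModuleList(nn.Linear(feature_dim, c) for c in classes_per_task)

    def forward(self, x: torch.Tensor, task_id: int) -> torch.Tensor:
        # Task boundaries are known, so the task id selects the head.
        return self.heads[task_id](self.backbone(x))


def train_task(model, loader, task_id, epochs=1, lr=1e-3):
    """Plain fine-tuning on the current task only; continual-learning methods
    add terms here (e.g. parameter regularization or rehearsal batches)."""
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss = loss_fn(model(x, task_id), y)
            loss.backward()
            opt.step()


@torch.no_grad()
def accuracy(model, loader, task_id):
    model.eval()
    correct = total = 0
    for x, y in loader:
        pred = model(x, task_id).argmax(dim=1)
        correct += (pred == y).sum().item()
        total += y.numel()
    return correct / max(total, 1)


# Task-incremental loop (task_streams is a hypothetical list of (train_loader, test_loader) pairs):
# after finishing task t, re-evaluating tasks 0..t exposes catastrophic forgetting.
# for t, (train_loader, test_loader) in enumerate(task_streams):
#     train_task(model, train_loader, t)
#     accs = [accuracy(model, task_streams[s][1], s) for s in range(t + 1)]
```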