The ability to learn new concepts continually is necessary in this ever-changing world. However, deep neural networks suffer from catastrophic forgetting when learning new categories. Many works have been proposed to alleviate this phenomenon, but most of them either fall into the stability-plasticity dilemma or incur excessive computation or storage overhead. Inspired by the gradient boosting algorithm, which gradually fits the residuals between the target model and the previous ensemble model, we propose a novel two-stage learning paradigm, FOSTER, empowering the model to learn new categories adaptively. Specifically, we first dynamically expand new modules to fit the residuals between the target and the output of the original model. Next, we remove redundant parameters and feature dimensions through an effective distillation strategy to maintain a single backbone model. We validate FOSTER on CIFAR-100 and ImageNet-100/1000 under different settings. Experimental results show that our method achieves state-of-the-art performance. Code is available at: https://github.com/G-U-N/ECCV22-FOSTER.
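To make the two-stage paradigm concrete, the sketch below illustrates the general idea in PyTorch: a frozen old backbone is expanded with a trainable new module that fits what the old model cannot explain, and the expanded model is then compressed back into a single backbone by distillation. This is a minimal, hedged sketch, not the paper's implementation (see the linked repository for that): the names (`make_backbone`, `BoostedModel`, `CompactModel`, `distill`), the toy MLP backbone, the CIFAR-like input shape, the plain knowledge-distillation loss, and all hyperparameters are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def make_backbone(feat_dim=64):
    # Hypothetical feature extractor for CIFAR-like 3x32x32 inputs; any
    # nn.Module producing a fixed-length feature vector would do.
    return nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, feat_dim), nn.ReLU())


class BoostedModel(nn.Module):
    """Stage 1 (expansion): freeze the old backbone, add a new trainable module,
    and classify on the concatenated features, so the new module is pushed to
    fit the residual between the target and the old model's output."""

    def __init__(self, old_backbone, feat_dim, num_classes):
        super().__init__()
        self.old_backbone = old_backbone
        for p in self.old_backbone.parameters():
            p.requires_grad = False  # keep previously learned knowledge fixed
        self.new_module = make_backbone(feat_dim)
        self.classifier = nn.Linear(2 * feat_dim, num_classes)

    def forward(self, x):
        feats = torch.cat([self.old_backbone(x), self.new_module(x)], dim=1)
        return self.classifier(feats)


class CompactModel(nn.Module):
    """Stage 2 target: a single backbone with the original feature dimension."""

    def __init__(self, feat_dim, num_classes):
        super().__init__()
        self.backbone = make_backbone(feat_dim)
        self.classifier = nn.Linear(feat_dim, num_classes)

    def forward(self, x):
        return self.classifier(self.backbone(x))


def distill(teacher, student, loader, epochs=1, T=2.0, lr=1e-3):
    """Stage 2 (compression): train the compact student to match the expanded
    teacher's softened predictions. A generic KD loss is used here as a
    stand-in for the paper's distillation strategy."""
    opt = torch.optim.SGD(student.parameters(), lr=lr)
    teacher.eval()
    for _ in range(epochs):
        for x, _ in loader:
            with torch.no_grad():
                t_logits = teacher(x)
            s_logits = student(x)
            loss = F.kl_div(
                F.log_softmax(s_logits / T, dim=1),
                F.softmax(t_logits / T, dim=1),
                reduction="batchmean",
            ) * (T * T)
            opt.zero_grad()
            loss.backward()
            opt.step()


# Minimal usage with random data (shapes and class count assumed for illustration):
old = make_backbone()
boosted = BoostedModel(old, feat_dim=64, num_classes=20)
# ... train `boosted` on the new-task data (stage 1), then compress (stage 2):
student = CompactModel(feat_dim=64, num_classes=20)
loader = torch.utils.data.DataLoader(
    torch.utils.data.TensorDataset(
        torch.randn(8, 3, 32, 32), torch.zeros(8, dtype=torch.long)
    ),
    batch_size=4,
)
distill(boosted, student, loader)
```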