Pre-trained representations are one of the key elements behind the success of modern deep learning. However, existing work on continual learning has mostly focused on training models incrementally from scratch. In this paper, we explore an alternative framework for incremental learning in which we continually fine-tune the model from a pre-trained representation. Our method takes advantage of the linearization of a pre-trained neural network for simple and effective continual learning. We show that this allows us to design a linear model for which the quadratic parameter regularization method is the optimal continual learning policy, while still enjoying the high performance of neural networks. We also show that the proposed algorithm enables parameter regularization methods to be applied to class-incremental problems. Additionally, we provide a theoretical explanation of why existing parameter-space regularization algorithms such as EWC underperform on neural networks trained with the cross-entropy loss. We show that the proposed method prevents forgetting while achieving high continual fine-tuning performance on image classification tasks. To demonstrate that our method applies to general continual learning settings, we evaluate it on data-incremental, task-incremental, and class-incremental learning problems.
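To make the linearization idea concrete, below is a minimal sketch (not the paper's implementation) of expanding a pre-trained network to first order in its parameters around the pre-trained weights, i.e. f_lin(x; θ) = f(x; θ₀) + ∇_θ f(x; θ₀)·(θ − θ₀). The function and parameter names are illustrative assumptions; the continual fine-tuning and regularization steps described in the paper would then operate on this linear-in-parameters model.

```python
# Minimal sketch of linearizing a pre-trained network around its
# pre-trained weights theta_0 via a first-order Taylor expansion.
# Names (apply_fn, params_0, linearize) are illustrative, not the paper's API.
import jax
import jax.numpy as jnp


def linearize(apply_fn, params_0):
    """Return a function computing the first-order Taylor expansion of
    `apply_fn` in its parameters around the pre-trained weights `params_0`."""
    def f_lin(params, x):
        # Parameter displacement theta - theta_0.
        delta = jax.tree_util.tree_map(lambda p, p0: p - p0, params, params_0)
        # f(x; theta_0) and the Jacobian-vector product J_f(x; theta_0) (theta - theta_0).
        f0, jvp_out = jax.jvp(lambda p: apply_fn(p, x), (params_0,), (delta,))
        return f0 + jvp_out
    return f_lin


# Example: a tiny two-layer network standing in for a pre-trained model.
def apply_fn(params, x):
    h = jnp.tanh(x @ params["w1"])
    return h @ params["w2"]


key = jax.random.PRNGKey(0)
params_0 = {
    "w1": jax.random.normal(key, (8, 16)) * 0.1,
    "w2": jax.random.normal(key, (16, 4)) * 0.1,
}
f_lin = linearize(apply_fn, params_0)
x = jnp.ones((2, 8))
# At theta = theta_0 the linearized model coincides with the original network.
assert jnp.allclose(f_lin(params_0, x), apply_fn(params_0, x), atol=1e-5)
```

Because the resulting model is linear in θ, continual fine-tuning of it with a quadratic penalty on parameter changes is exactly the setting in which such parameter-space regularization can be optimal, which is the property the abstract refers to.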