Deep transfer learning (DTL) techniques try to tackle the limitations of deep learning, namely the dependency on extensive training data and the high training cost, by reusing previously obtained knowledge. However, current DTL techniques suffer from either the catastrophic forgetting dilemma (losing previously obtained knowledge) when fine-tuning a pre-trained model, or an overly biased pre-trained model (harder to adapt to the target data) when freezing part of it. Progressive learning, a sub-category of DTL, reduces the effect of the overly biased model that results from freezing earlier layers by adding a new layer to the end of a frozen pre-trained model. Although this has been successful in many cases, it still cannot handle distant source and target data. We propose a new continual/progressive learning approach for deep transfer learning that tackles these limitations. To avoid both the catastrophic forgetting and the overly biased model problems, we expand the pre-trained model by widening its pre-trained layers (adding new nodes to each layer) instead of only appending new layers; hence the method is named EXPANSE. Our experimental results confirm that this technique can handle distant source and target data while the final model remains valid on the source data, yielding a promising deep continual learning approach. Moreover, we offer a new way of training deep learning models inspired by the human education system, which we term two-step training: learning the basics first, then adding complexities and uncertainties. The evaluation suggests that two-step training extracts more meaningful features and reaches a finer basin on the error surface, since it achieves better accuracy than regular training. EXPANSE (model expansion plus two-step training) is a systematic continual learning approach applicable to different problems and DL models.
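To make the expansion idea concrete, the following is a minimal sketch of widening a single pre-trained layer, assuming a PyTorch-style implementation; the `ExpandedLinear` class, its parameter names, and the layer sizes are illustrative assumptions, not the authors' code. The pre-trained weights stay frozen, so previously obtained knowledge is preserved, while the newly added nodes provide trainable capacity for the target data; in a full model, the input side of the next layer would have to be widened accordingly.

```python
import torch
import torch.nn as nn

class ExpandedLinear(nn.Module):
    """Widen a pre-trained linear layer: frozen original nodes plus new trainable nodes."""

    def __init__(self, pretrained: nn.Linear, extra_out: int):
        super().__init__()
        # Keep the pre-trained weights frozen to preserve the source knowledge.
        self.frozen = pretrained
        for p in self.frozen.parameters():
            p.requires_grad = False
        # New nodes added to the same layer; only these are trained on the target data.
        self.new = nn.Linear(pretrained.in_features, extra_out)

    def forward(self, x):
        # Concatenate frozen and new activations along the feature dimension.
        return torch.cat([self.frozen(x), self.new(x)], dim=-1)

# Illustrative usage: expand a 128-unit pre-trained layer with 32 new nodes.
pretrained_layer = nn.Linear(64, 128)   # stands in for a layer taken from a pre-trained model
expanded = ExpandedLinear(pretrained_layer, extra_out=32)
out = expanded(torch.randn(8, 64))      # output shape: (8, 160)
```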
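The abstract does not specify how the two-step curriculum is constructed, so the sketch below only illustrates one possible reading of the schedule under assumed details: a hypothetical `is_basic` predicate selects the simpler examples for the first phase, and training then continues on the full data with its complexities and uncertainties.

```python
def two_step_train(model, dataset, optimizer, loss_fn, is_basic,
                   basic_epochs=5, full_epochs=10):
    """Assumed two-phase schedule: train on 'basic' examples first, then on everything."""
    # Phase 1 data: the subset the (hypothetical) is_basic predicate marks as simple.
    basic_subset = [example for example in dataset if is_basic(example)]
    for data, n_epochs in ((basic_subset, basic_epochs), (dataset, full_epochs)):
        for _ in range(n_epochs):
            for x, y in data:          # each example is assumed to be an (input, target) pair
                optimizer.zero_grad()
                loss_fn(model(x), y).backward()
                optimizer.step()
```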