By default, neural networks learn on all training data at once. When such a model is trained on sequential chunks of new data, it tends to catastrophically forget how to handle old data. In this work we investigate how continual learners learn and forget representations. We observe two phenomena: knowledge accumulation, i.e. the improvement of a representation over time, and feature forgetting, i.e. the loss of task-specific representations. To better understand both phenomena, we introduce a new analysis technique called task exclusion comparison. If a model has seen a task and it has not forgotten all the task-specific features, then its representation for that task should be better than that of a model that was trained on similar tasks, but not that exact one. Our image classification experiments show that most task-specific features are quickly forgotten, in contrast to what has been suggested in the past. Further, we demonstrate how some continual learning methods, like replay, and ideas from representation learning affect a continually learned representation. We conclude by observing that representation quality is tightly correlated with continual learning performance.
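The task exclusion comparison described above can be sketched as follows. This is a minimal toy illustration, not the paper's implementation: the "continual learner" is stood in for by pooled PCA over the tasks seen so far, tasks are synthetic Gaussian classification problems, and the probe is a nearest-centroid linear readout. All function names (`make_task`, `fit_representation`, `probe_accuracy`) are hypothetical. The point is the protocol: probe the target task under a representation trained with it versus one trained only on similar tasks.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_task(offset, n=200, d=20):
    """Toy task: two Gaussian classes separated along dimension `offset`."""
    X = rng.normal(size=(n, d))
    y = rng.integers(0, 2, size=n)
    X[:, offset] += 3.0 * y  # each task's discriminative direction differs
    return X, y

def fit_representation(Xs, k=5):
    """Stand-in learner: top-k principal directions of the pooled data."""
    X = np.vstack(Xs)
    X = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return Vt[:k].T  # (d, k) projection matrix

def probe_accuracy(W, X, y):
    """Linear probe: nearest class centroid in the learned representation."""
    Z = X @ W
    c0, c1 = Z[y == 0].mean(axis=0), Z[y == 1].mean(axis=0)
    pred = (np.linalg.norm(Z - c1, axis=1) <
            np.linalg.norm(Z - c0, axis=1)).astype(int)
    return float((pred == y).mean())

tasks = [make_task(offset=i) for i in range(4)]
target_X, target_y = tasks[0]

# Representation from a model that saw the target task vs. one trained
# only on the similar (remaining) tasks -- the "task exclusion" control.
W_incl = fit_representation([X for X, _ in tasks])
W_excl = fit_representation([X for X, _ in tasks[1:]])

acc_incl = probe_accuracy(W_incl, target_X, target_y)
acc_excl = probe_accuracy(W_excl, target_X, target_y)
print(f"probe accuracy, task seen: {acc_incl:.2f}; task excluded: {acc_excl:.2f}")
```

Under the paper's reasoning, a gap between the two probe accuracies would indicate surviving task-specific features; if the two are close, those features have been forgotten.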