Contemporary neural networks are limited in their ability to learn from evolving streams of training data. When trained sequentially on new or evolving tasks, their accuracy drops sharply, making them unsuitable for many real-world applications. In this work, we shed light on the causes of this well-known yet unsolved phenomenon, often referred to as catastrophic forgetting, in a class-incremental (class-IL) setup. We show that a combination of simple components and a loss that balances intra-task and inter-task learning can already resolve forgetting to the same extent as more complex measures proposed in the literature. Moreover, we identify the poor quality of the learned representation as another cause of catastrophic forgetting in class-IL. We show that performance correlates with the secondary class information (dark knowledge) learned by the model and that it can be improved by an appropriate regularizer. With these lessons learned, class-incremental learning results on CIFAR-100 and ImageNet improve over the state-of-the-art by a large margin, while keeping the approach simple.
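To make the idea of "balancing intra-task and inter-task learning" concrete, the sketch below shows one common way such a loss can be assembled: a cross-entropy term for the current task combined with a knowledge-distillation term that preserves the previous model's soft outputs (the dark knowledge mentioned above). This is a minimal illustration under assumed conventions, not the paper's exact formulation; the function name `class_il_loss`, the weighting factor `alpha`, and the temperature `T` are illustrative choices.

```python
# Minimal sketch (not the paper's exact loss): balance intra-task learning
# (cross-entropy on current labels) with inter-task stability (distillation
# of the frozen previous model's soft predictions on old classes).
# `alpha` and `T` are illustrative hyperparameters, not values from the paper.
import torch
import torch.nn.functional as F

def class_il_loss(logits_new, logits_old, targets, n_old_classes,
                  alpha=0.5, T=2.0):
    """logits_new: current model outputs, shape (B, n_old + n_new)
       logits_old: frozen previous-task model outputs, shape (B, n_old)
       targets:    ground-truth labels for the current batch."""
    # Intra-task term: standard cross-entropy over all classes seen so far.
    ce = F.cross_entropy(logits_new, targets)

    # Inter-task term: distill the old model's softened predictions on the
    # previously learned classes to retain secondary class information.
    log_p_new = F.log_softmax(logits_new[:, :n_old_classes] / T, dim=1)
    p_old = F.softmax(logits_old / T, dim=1)
    kd = F.kl_div(log_p_new, p_old, reduction="batchmean") * (T * T)

    return (1.0 - alpha) * ce + alpha * kd
```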