Self-supervised learning (SSL) has shown remarkable performance in computer vision tasks when trained offline. However, in a Continual Learning (CL) scenario where new data is introduced progressively, models still suffer from catastrophic forgetting. Retraining a model from scratch to adapt to newly generated data is time-consuming and inefficient. Previous approaches suggested re-purposing self-supervised objectives with knowledge distillation to mitigate forgetting across tasks, assuming that labels from all tasks are available during fine-tuning. In this paper, we generalize self-supervised continual learning to a practical setting where available labels can be leveraged at any step of the SSL process. With an increasing number of continual tasks, this offers more flexibility in the pre-training and fine-tuning phases. With Kaizen, we introduce a training architecture that is able to mitigate catastrophic forgetting for both the feature extractor and classifier with a carefully designed loss function. Using a comprehensive set of evaluation metrics reflecting different aspects of continual learning, we demonstrate that Kaizen significantly outperforms previous SSL models on competitive vision benchmarks, with up to 16.5% accuracy improvement on split CIFAR-100. Kaizen is able to balance the trade-off between knowledge retention and learning from new data with an end-to-end model, paving the way for practical deployment of continual learning systems.