A fundamental and challenging problem in deep learning is catastrophic forgetting, i.e., the tendency of neural networks to fail to preserve the knowledge acquired from old tasks when learning new tasks. This problem has been widely investigated by the research community, and several Incremental Learning (IL) approaches have been proposed in recent years. While earlier works in computer vision mostly focused on image classification and object detection, some IL approaches for semantic segmentation have been introduced more recently. These previous works showed that, despite its simplicity, knowledge distillation can be effectively employed to alleviate catastrophic forgetting. In this paper, we follow this research direction and, inspired by recent literature on contrastive learning, propose a novel distillation framework, Uncertainty-aware Contrastive Distillation (\method). In a nutshell, \method~operates by introducing a novel distillation loss that takes into account all the images in a mini-batch, enforcing similarity between features associated with pixels from the same classes and pulling apart features corresponding to pixels from different classes. To mitigate catastrophic forgetting, we contrast the features of the new model with features extracted by a frozen model learned at the previous incremental step. Our experimental results demonstrate the advantage of the proposed distillation technique, which can be used in synergy with previous IL approaches and leads to state-of-the-art performance on three commonly adopted benchmarks for incremental semantic segmentation. The code is available at \url{https://github.com/ygjwd12345/UCD}.
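To make the idea concrete, the following is a minimal sketch (not the authors' released implementation; see the repository above for that) of a pixel-wise contrastive distillation loss of the kind described: current-model pixel features are pulled towards frozen previous-step features of same-class pixels across the mini-batch and pushed away from those of different classes. All names and hyper-parameters (contrastive_distillation_loss, temperature, num_samples) are illustrative assumptions, and the uncertainty weighting that gives \method~its name is omitted here.

```python
# Sketch of a pixel-wise contrastive distillation loss (illustrative, assumed API).
import torch
import torch.nn.functional as F

def contrastive_distillation_loss(new_feats, old_feats, labels,
                                  temperature=0.1, num_samples=1024):
    """new_feats, old_feats: (B, C, H, W) pixel embeddings from the current model
    and the frozen previous-step model; labels: (B, H, W) class indices."""
    B, C, H, W = new_feats.shape
    # Flatten all pixels in the mini-batch and sub-sample to bound memory.
    new_flat = new_feats.permute(0, 2, 3, 1).reshape(-1, C)
    old_flat = old_feats.permute(0, 2, 3, 1).reshape(-1, C)
    lbl_flat = labels.reshape(-1)
    idx = torch.randperm(lbl_flat.numel(), device=lbl_flat.device)[:num_samples]

    anchors = F.normalize(new_flat[idx], dim=1)           # current-model features
    targets = F.normalize(old_flat[idx], dim=1).detach()  # frozen-model features
    lbl = lbl_flat[idx]

    # Cosine similarities between every anchor and every frozen target.
    logits = anchors @ targets.t() / temperature           # (N, N)
    pos_mask = (lbl.unsqueeze(0) == lbl.unsqueeze(1)).float()

    # InfoNCE-style objective: positives are all frozen features of the same class.
    log_prob = logits - torch.logsumexp(logits, dim=1, keepdim=True)
    loss = -(pos_mask * log_prob).sum(1) / pos_mask.sum(1).clamp(min=1)
    return loss.mean()
```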