Continually learning to segment more and more types of image regions is a desired capability for many intelligent systems. However, such continual semantic segmentation suffers from the same catastrophic forgetting issue as continual classification learning. While multiple knowledge distillation strategies originally developed for continual classification have been well adapted to continual semantic segmentation, they consider transferring old knowledge only based on the outputs of one or more layers of deep fully convolutional networks. Different from existing solutions, this study proposes to transfer a new type of knowledge-relevant information, namely the relationships between elements (e.g., pixels or small local regions) within each image, which can capture both within-class and between-class knowledge. This relationship information can be effectively obtained from the self-attention maps of a Transformer-style segmentation model. Considering that pixels belonging to the same class in an image often share similar visual properties, class-specific region pooling is applied to provide more efficient relationship information for knowledge transfer. Extensive evaluations on multiple public benchmarks show that the proposed self-attention transfer method can further effectively alleviate catastrophic forgetting, and that its flexible combination with one or more widely adopted strategies significantly outperforms state-of-the-art solutions.
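To make the idea concrete, the following is a minimal sketch (not the authors' implementation) of how class-specific region pooling followed by relation-map distillation could look in PyTorch: per-class token features are averaged within each image, a self-attention-style pairwise relation matrix is computed over the pooled regions for both the old (teacher) and new (student) models, and the two matrices are matched with a KL term. All names (`class_region_pool`, `relation_distill_loss`, `tau`, the feature shapes) are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def class_region_pool(features, labels, num_classes):
    """Average token/pixel embeddings within each class present in the image.

    features: (N, C) token or pixel embeddings for one image.
    labels:   (N,)   class index of each token (ground truth or pseudo-labels).
    Returns a (K, C) tensor of per-class region embeddings and the class ids kept.
    """
    pooled, class_ids = [], []
    for c in range(num_classes):
        mask = labels == c
        if mask.any():
            pooled.append(features[mask].mean(dim=0))
            class_ids.append(c)
    return torch.stack(pooled), class_ids

def relation_distill_loss(student_feats, teacher_feats, labels, num_classes, tau=1.0):
    """KL divergence between the self-attention-style relation maps of the pooled
    class regions produced by the old (teacher) and new (student) models."""
    s_regions, _ = class_region_pool(student_feats, labels, num_classes)
    t_regions, _ = class_region_pool(teacher_feats, labels, num_classes)
    # Pairwise relation maps over class regions, analogous to an attention score matrix.
    s_rel = F.log_softmax(s_regions @ s_regions.t() / tau, dim=-1)
    t_rel = F.softmax(t_regions @ t_regions.t() / tau, dim=-1)
    return F.kl_div(s_rel, t_rel, reduction="batchmean")
```

In practice this relation term would be added to the usual segmentation and distillation losses; pooling over class-specific regions keeps the relation matrices small (one row per class present in the image) compared with matching full pixel-level attention maps.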