Knowledge distillation is widely adopted in semantic segmentation to reduce the computation cost. Previous knowledge distillation methods for semantic segmentation focus on pixel-wise feature alignment and intra-class feature variation distillation, neglecting to transfer the knowledge of the inter-class distance in the feature space, which is important for semantic segmentation. To address this issue, we propose an Inter-class Distance Distillation (IDD) method to transfer the inter-class distance in the feature space from the teacher network to the student network. Furthermore, since semantic segmentation is a position-dependent task, we exploit a position information distillation module to help the student network encode more position information. Extensive experiments on three popular datasets, Cityscapes, Pascal VOC, and ADE20K, show that our method helps improve the accuracy of semantic segmentation models and achieves state-of-the-art performance; e.g., it boosts the benchmark model ("PSPNet + ResNet18") by 7.50% in accuracy on the Cityscapes dataset.
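The abstract does not give the distillation loss in closed form, so the following is a minimal PyTorch sketch of one natural realization: class prototypes are assumed to be mean-pooled features under the ground-truth mask, and the inter-class distance is assumed to be Euclidean. The function names (`class_prototypes`, `inter_class_distance_loss`) are illustrative, not the paper's code.

```python
import torch
import torch.nn.functional as F

def class_prototypes(feat, label, num_classes):
    """Mean feature (prototype) per class; also returns which classes are present."""
    B, C, h, w = feat.shape
    # Nearest-neighbour resize keeps label ids valid at the feature resolution.
    label = F.interpolate(label.unsqueeze(1).float(), size=(h, w), mode="nearest")
    label = label.long().view(-1)                    # (B*h*w,)
    feat = feat.permute(0, 2, 3, 1).reshape(-1, C)   # (B*h*w, C), same pixel order
    protos, valid = [], []
    for c in range(num_classes):                     # ignore labels (e.g. 255) fall outside this range
        mask = label == c
        valid.append(mask.any())
        protos.append(feat[mask].mean(dim=0) if mask.any() else feat.new_zeros(C))
    return torch.stack(protos), torch.stack(valid)   # (K, C), (K,)

def inter_class_distance_loss(feat_s, feat_t, label, num_classes):
    """Match the student's pairwise prototype distances to the teacher's."""
    p_s, present = class_prototypes(feat_s, label, num_classes)
    p_t, _ = class_prototypes(feat_t, label, num_classes)
    d_s = torch.cdist(p_s, p_s)                      # (K, K) student distances
    d_t = torch.cdist(p_t, p_t)                      # (K, K) teacher distances
    pair = present[:, None] & present[None, :]       # only classes seen in this batch
    return F.mse_loss(d_s[pair], d_t[pair].detach()) # teacher side is frozen
```

One convenient property of this formulation: since only the K×K distance matrices are compared, the teacher and student feature channels need not match, so no projection layer is required between the two networks.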
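The abstract only names the position information distillation module without describing its internals. Purely as a labeled assumption, the sketch below reads "encode more position information" as an auxiliary head that regresses a normalized per-pixel coordinate grid from the student's features; `pos_head` is a hypothetical 1x1 convolution, not a component described in the paper.

```python
import torch
import torch.nn.functional as F

def coord_grid(h, w, device):
    # Normalized (x, y) coordinates in [-1, 1] for each pixel of an h x w map.
    ys = torch.linspace(-1.0, 1.0, h, device=device)
    xs = torch.linspace(-1.0, 1.0, w, device=device)
    gy, gx = torch.meshgrid(ys, xs, indexing="ij")
    return torch.stack([gx, gy])                     # (2, h, w)

def position_distillation_loss(feat_s, pos_head):
    # ASSUMPTION: the student learns position by regressing the coordinate grid.
    pred = pos_head(feat_s)                          # (B, 2, h, w)
    B, _, h, w = pred.shape
    target = coord_grid(h, w, pred.device).unsqueeze(0).expand(B, -1, -1, -1)
    return F.mse_loss(pred, target)

# Hypothetical usage, with C_s the student's feature channel count:
# pos_head = torch.nn.Conv2d(C_s, 2, kernel_size=1)
# loss_pos = position_distillation_loss(feat_s, pos_head)
```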