In this work, we explore data augmentation for knowledge distillation in semantic segmentation. To avoid over-fitting to the noise in the teacher network, a large number of training examples is essential for knowledge distillation. Image-level augmentation techniques such as flipping, translation, or rotation are widely used in previous knowledge distillation frameworks. Inspired by recent progress on semantic directions in feature space, we propose to include feature-space augmentations for efficient distillation. Specifically, given a semantic direction, an infinite number of augmentations can be obtained for the student in feature space. Furthermore, our analysis shows that these augmentations can be optimized simultaneously by minimizing an upper bound on the losses they define. Based on this observation, we develop a new algorithm for knowledge distillation in semantic segmentation. Extensive experiments on four semantic segmentation benchmarks demonstrate that the proposed method boosts the performance of current knowledge distillation methods without any significant overhead. Code is available at: https://github.com/jianlong-yuan/FAKD.
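To make the idea concrete, the sketch below illustrates feature-space augmentation for distillation with a Monte Carlo approximation: student features are perturbed along class-conditional semantic directions (modeled here as a Gaussian with per-class diagonal covariance), and a distillation loss is averaged over the sampled copies. This is not the authors' released implementation; the names `student_feats`, `class_cov`, and `student_head` are hypothetical, and the paper itself minimizes a closed-form upper bound of this expectation rather than sampling explicitly.

```python
# Minimal sketch (assumed interface, not the official FAKD code) of
# feature-space augmentation for knowledge distillation.
import torch
import torch.nn.functional as F

def feature_space_augment(student_feats, labels, class_cov, strength=0.5, n_samples=4):
    """Draw augmented copies of the student features by perturbing them
    along class-conditional semantic directions, approximated here by a
    Gaussian with per-class diagonal covariance.

    student_feats: (B, C, H, W) features from the student backbone.
    labels:        (B, H, W) ground-truth class indices per pixel.
    class_cov:     (num_classes, C) running estimate of per-class feature variance.
    """
    # Look up the (diagonal) covariance of the class at every pixel.
    pixel_cov = class_cov[labels.clamp(min=0)]    # (B, H, W, C)
    pixel_cov = pixel_cov.permute(0, 3, 1, 2)     # (B, C, H, W)
    augmented = []
    for _ in range(n_samples):
        noise = torch.randn_like(student_feats) * pixel_cov.sqrt()
        augmented.append(student_feats + strength * noise)
    return augmented

def distillation_loss(student_head, augmented_feats, teacher_logits, T=4.0):
    """Average a KL-based distillation loss over the sampled augmentations.
    The paper instead optimizes an upper bound of this expectation in closed
    form, which covers the infinite-augmentation case without sampling."""
    loss = 0.0
    for feats in augmented_feats:
        s_logits = student_head(feats)
        loss = loss + F.kl_div(
            F.log_softmax(s_logits / T, dim=1),
            F.softmax(teacher_logits / T, dim=1),
            reduction="batchmean",
        ) * (T * T)
    return loss / len(augmented_feats)
```

As the number of samples grows, the averaged loss approaches the expectation over all augmentations along the semantic direction, which is exactly the quantity the proposed upper bound controls in closed form.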