Recently, CNN-based single image super-resolution (SISR) networks have achieved better performance at the cost of numerous parameters and high computational complexity, which limits their applicability to resource-constrained devices such as mobile phones. Knowledge Distillation (KD), which transfers a teacher network's useful knowledge to a student network, is being actively studied as one way to make such networks efficient. More recently, KD for SISR has utilized Feature Distillation (FD), which minimizes the Euclidean distance loss between the feature maps of the teacher and student networks; however, this does not sufficiently consider how to effectively and meaningfully deliver knowledge from the teacher so as to improve student performance under given network capacity constraints. In this paper, we propose a Feature-domain Adaptive Contrastive Distillation (FACD) method for efficiently training lightweight student SISR networks. We show the limitations of existing FD methods based on Euclidean distance loss, and propose a feature-domain contrastive loss that enables the student network to learn richer information from the teacher's representation in the feature domain. In addition, we propose an adaptive distillation scheme that selectively applies distillation depending on the conditions of the training patches. Experimental results show that student EDSR and RCAN networks trained with the proposed FACD scheme improve not only PSNR performance across all benchmark datasets and scales, but also subjective image quality, compared to conventional FD approaches.
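To make the contrast between the two loss formulations concrete, the following is a minimal PyTorch sketch of the conventional FD Euclidean (L2) loss alongside an InfoNCE-style feature-domain contrastive loss. The tensor shapes, temperature `tau`, and negative-sampling scheme are illustrative assumptions, not the authors' exact FACD implementation.

```python
# Illustrative sketch only: shapes, temperature, and negative sampling
# are assumptions, not the exact FACD implementation.
import torch
import torch.nn.functional as F

def fd_l2_loss(student_feat, teacher_feat):
    """Conventional feature distillation: Euclidean (L2) distance
    between student and teacher feature maps of matching shape."""
    return F.mse_loss(student_feat, teacher_feat)

def feature_contrastive_loss(student_feat, teacher_feat, negatives, tau=0.1):
    """InfoNCE-style feature-domain contrastive loss (illustrative).
    Pulls the student feature toward the teacher feature of the same
    patch (positive) and pushes it away from other features (negatives).

    student_feat: (B, D) pooled/flattened student features
    teacher_feat: (B, D) pooled/flattened teacher features (positives)
    negatives:    (B, K, D) K negative features per sample
    """
    s = F.normalize(student_feat, dim=-1)           # (B, D)
    t = F.normalize(teacher_feat, dim=-1)           # (B, D)
    n = F.normalize(negatives, dim=-1)              # (B, K, D)
    pos = (s * t).sum(-1, keepdim=True) / tau       # (B, 1) similarity to positive
    neg = torch.einsum('bd,bkd->bk', s, n) / tau    # (B, K) similarities to negatives
    logits = torch.cat([pos, neg], dim=1)           # (B, 1+K)
    labels = torch.zeros(s.size(0), dtype=torch.long, device=s.device)
    return F.cross_entropy(logits, labels)          # positive is class 0
```

Relative to the plain L2 objective, the contrastive form supervises the student not only with where the teacher's feature lies but also with where it does not, which is one way to read the abstract's claim of learning "richer information" from the teacher's representation.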