Convolutional Neural Networks (CNNs) have been applied in various fields and have demonstrated excellent performance, particularly in Single-Image Super-Resolution (SISR). However, recent CNN-based SISR networks require numerous parameters and high computational cost to achieve better performance. As one way of making such networks efficient, Knowledge Distillation (KD), which optimizes the performance trade-off by adding a loss term to the existing network architecture, is currently being studied. KD for SISR is mainly formulated as feature distillation (FD), which minimizes the L1-distance loss between the feature maps of the teacher and student networks, but this does not fully take into account the amount and importance of the information the student can accept. In this paper, we propose a feature-based adaptive contrastive distillation (FACD) method for efficiently training lightweight SISR networks. We show the limitations of the existing feature distillation (FD) with L1-distance loss, and propose a feature-based contrastive loss that maximizes the mutual information between the feature maps of the teacher and student networks. The experimental results show that the proposed FACD improves not only the PSNR performance across all benchmark datasets and scales but also the subjective image quality compared to the conventional FD approach.
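To make the idea of a feature-based contrastive loss concrete, the sketch below shows one common InfoNCE-style formulation between pooled teacher and student feature maps; it is an illustrative assumption, not the paper's exact FACD loss, and the function name, pooling choice, and temperature value are hypothetical.

```python
import torch
import torch.nn.functional as F


def feature_contrastive_loss(student_feat: torch.Tensor,
                             teacher_feat: torch.Tensor,
                             temperature: float = 0.1) -> torch.Tensor:
    """InfoNCE-style contrastive loss over (B, C, H, W) feature maps.

    Each sample's pooled student feature is pulled toward the corresponding
    teacher feature (positive pair) and pushed away from teacher features of
    other samples in the batch (negatives). Minimizing this loss maximizes a
    lower bound on the mutual information between the two representations.
    Assumes teacher and student feature maps share the channel dimension;
    otherwise a learned projection would be needed.
    """
    # Global average pooling -> (B, C), then L2-normalization.
    s = F.normalize(student_feat.mean(dim=(2, 3)), dim=1)
    t = F.normalize(teacher_feat.mean(dim=(2, 3)), dim=1)

    # Pairwise similarities between every student and teacher feature.
    logits = s @ t.t() / temperature              # (B, B)
    targets = torch.arange(s.size(0), device=s.device)

    # Diagonal entries are the matching (positive) pairs.
    return F.cross_entropy(logits, targets)


if __name__ == "__main__":
    s_feat = torch.randn(8, 64, 48, 48)   # student feature map
    t_feat = torch.randn(8, 64, 48, 48)   # teacher feature map
    print(feature_contrastive_loss(s_feat, t_feat))
```

In practice such a term would be combined with the reconstruction loss (and possibly the conventional L1 feature-distillation loss) when training the lightweight student network.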