Knowledge Distillation (KD) is a strategy for defining a set of transferability gangways to improve the efficiency of Convolutional Neural Networks. Feature-based Knowledge Distillation is a subfield of KD that uses intermediate network representations, either unaltered or depth-reduced via maximum activation maps, as the source knowledge. In this paper, we propose and analyse the use of a 2D frequency transform of the activation maps before transferring them. We posit that, by using global image cues rather than pixel estimates, this strategy enhances knowledge transferability in tasks such as scene recognition, which is characterized by strong spatial and contextual relationships between multiple and varied concepts. To validate the proposed method, we present an extensive evaluation against the state of the art in scene recognition. Experimental results provide strong evidence that the proposed strategy enables the student network to better focus on the relevant image areas learnt by the teacher network, leading to more descriptive features and higher transferred performance than the other state-of-the-art alternatives. We publicly release the training and evaluation framework used throughout this paper at http://www-vpu.eps.uam.es/publications/DCTBasedKDForSceneRecognition.
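To make the described pipeline concrete, the following is a minimal sketch, not the authors' released implementation, of the kind of frequency-domain attention transfer the abstract outlines: intermediate activations are depth-reduced to a spatial attention map via a channel-wise maximum, moved to the frequency domain with a 2D DCT, and compared between teacher and student. The function names, normalization, loss form, and tensor shapes are illustrative assumptions; the official framework is available at the URL above.

```python
import numpy as np
from scipy.fft import dctn  # 2D (n-dimensional) DCT-II


def dct_attention_map(features: np.ndarray) -> np.ndarray:
    """Reduce a (C, H, W) activation tensor to a 2D attention map via a
    channel-wise maximum, then transform it with a 2D DCT.

    Assumption: L2 normalization before the transform, so the comparison is
    insensitive to differences in activation scale between networks.
    """
    attention = features.max(axis=0)                             # (H, W) maximum activation map
    attention = attention / (np.linalg.norm(attention) + 1e-8)   # scale normalization
    return dctn(attention, norm="ortho")                         # 2D DCT coefficients


def dct_attention_loss(teacher_feats: np.ndarray,
                       student_feats: np.ndarray) -> float:
    """Illustrative transfer loss: mean squared error between teacher and
    student DCT coefficients of their maximum activation maps."""
    t = dct_attention_map(teacher_feats)
    s = dct_attention_map(student_feats)
    return float(np.mean((t - s) ** 2))


# Hypothetical usage with random intermediate activations; the spatial size
# must match between networks, while channel counts may differ.
rng = np.random.default_rng(0)
teacher = rng.standard_normal((512, 14, 14))  # teacher: wider feature tensor
student = rng.standard_normal((128, 14, 14))  # student: narrower feature tensor
print(dct_attention_loss(teacher, student))
```

The intuition behind matching DCT coefficients rather than raw pixels follows the abstract's argument: low-frequency coefficients summarize the global spatial layout of the attention map, so the loss emphasizes where the teacher attends overall rather than per-pixel activation estimates.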