Knowledge distillation has achieved remarkable results in model compression. However, most existing methods require the original training data, which is usually unavailable due to privacy and security concerns. In this paper, we propose a conditional generative data-free knowledge distillation (CGDD) framework for training lightweight networks without any training data. The method performs efficient knowledge distillation via conditional image generation. Specifically, we treat preset labels as ground truth to train a conditional generator in a semi-supervised manner, so that the trained generator can produce training images of specified classes. To train the student network, we force it to extract the knowledge hidden in the teacher's feature maps, which provide crucial cues for the learning process. Moreover, we construct an adversarial training framework with several tailored loss functions to improve distillation performance; this framework helps the student model explore a larger data space. To demonstrate the effectiveness of the proposed method, we conduct extensive experiments on different datasets. Compared with other data-free works, our method obtains state-of-the-art results on CIFAR100, Caltech101, and different versions of the ImageNet dataset. The code will be released.
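The interplay of the objectives described above can be sketched numerically: the student minimizes its discrepancy with the teacher on generated samples, while the generator adversarially maximizes that discrepancy and is conditioned on preset labels. The toy linear "networks", dimensions, and loss weighting below are illustrative assumptions, not the paper's actual architecture or losses.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def kl_divergence(p, q, eps=1e-8):
    # KL(p || q), averaged over the batch; measures teacher-student discrepancy
    return np.mean(np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=1))

# Toy stand-ins: a frozen teacher, a student to be trained, and a conditional
# generator mapping (noise, one-hot label) -> a fake "image" vector.
n_classes, noise_dim, img_dim = 10, 8, 16
W_teacher = rng.normal(size=(img_dim, n_classes))        # frozen teacher
W_student = rng.normal(size=(img_dim, n_classes)) * 0.1  # student to train
W_gen = rng.normal(size=(noise_dim + n_classes, img_dim)) * 0.1

labels = rng.integers(0, n_classes, size=32)             # preset labels
onehot = np.eye(n_classes)[labels]
z = rng.normal(size=(len(labels), noise_dim))
x_fake = np.tanh(np.concatenate([z, onehot], axis=1) @ W_gen)

p_teacher = softmax(x_fake @ W_teacher)
p_student = softmax(x_fake @ W_student)

# Student objective: minimize discrepancy with the teacher on generated data.
student_loss = kl_divergence(p_teacher, p_student)

# Generator objective (adversarial): maximize that same discrepancy, plus a
# conditional term encouraging the teacher to assign the preset labels.
cond_loss = -np.mean(np.sum(onehot * np.log(p_teacher + 1e-8), axis=1))
generator_loss = -student_loss + cond_loss
```

In a real implementation both losses would be backpropagated in alternating steps; here only the forward objectives are shown to make the adversarial structure explicit.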