Despite their excellent performance in image generation, Generative Adversarial Networks (GANs) are notorious for their enormous storage and intensive computation requirements. As a powerful "performance maker", knowledge distillation has been demonstrated to be particularly efficacious in exploring low-cost GANs. In this paper, we investigate the irreplaceability of the teacher discriminator and present an inventive discriminator-cooperated distillation, abbreviated as DCD, for refining better feature maps from the generator. In contrast to conventional pixel-to-pixel matching methods in feature-map distillation, our DCD utilizes the teacher discriminator as a transformation that drives intermediate results of the student generator to be perceptually close to the corresponding outputs of the teacher generator. Furthermore, to mitigate mode collapse in GAN compression, we construct a collaborative adversarial training paradigm in which the teacher discriminator is established from scratch to co-train with the student generator alongside our DCD. Our DCD shows superior results compared with existing GAN compression methods. For instance, after reducing the MACs of CycleGAN by over 40x and its parameters by over 80x, we decrease the FID metric from 61.53 to 48.24, while the current SoTA method only reaches 51.92. The source code is available at https://github.com/poopit/DCD-official.
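To make the distillation signal concrete, below is a minimal PyTorch-style sketch of the idea described above: instead of matching student and teacher feature maps pixel to pixel, both are passed through stages of the teacher discriminator and matched in that perceptual space. The function name `dcd_distillation_loss` and the `teacher_disc_blocks` interface are hypothetical illustrations, not the official API; the exact formulation is in the linked repository.

```python
import torch
import torch.nn.functional as F

def dcd_distillation_loss(student_feat, teacher_feat, teacher_disc_blocks):
    """Discriminator-cooperated feature distillation (illustrative sketch).

    student_feat / teacher_feat: corresponding intermediate outputs of the
    student and teacher generators (N, C, H, W tensors).
    teacher_disc_blocks: a list of teacher-discriminator stages whose
    activations define the perceptual space (hypothetical interface).
    """
    loss = torch.zeros((), device=student_feat.device)
    fs = student_feat
    ft = teacher_feat.detach()  # no gradient flows back to the teacher
    for block in teacher_disc_blocks:
        fs = block(fs)
        with torch.no_grad():
            ft = block(ft)
        # Match in the discriminator's feature space, not pixel space.
        loss = loss + F.l1_loss(fs, ft)
    return loss
```

In training, a term of this kind would be added to the student generator's adversarial objective, with the teacher discriminator being the one co-trained from scratch in the collaborative paradigm sketched above.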