Generative Adversarial Networks (GANs) have been widely used in image translation, but their high computational and storage costs impede their deployment on mobile devices. Prevalent methods for CNN compression cannot be directly applied to GANs due to the complicated generator architecture and the unstable adversarial training. To solve these problems, in this paper, we introduce a novel GAN compression method, termed DMAD, by proposing a Differentiable Mask and a co-Attention Distillation. The former searches for a light-weight generator architecture in a training-adaptive manner. To overcome channel inconsistency when pruning the residual connections, an adaptive cross-block group sparsity is further incorporated. The latter simultaneously distills informative attention maps from both the generator and discriminator of a pre-trained model to the searched generator, effectively stabilizing the adversarial training of our light-weight model. Experiments show that DMAD can reduce the Multiply Accumulate Operations (MACs) of CycleGAN by 13$\times$ and those of Pix2Pix by 4$\times$ while retaining performance comparable to the full model. Our code is available at https://github.com/SJLeo/DMAD.
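The cross-block group sparsity mentioned above addresses a shape-consistency constraint: channels carried through a residual (skip) connection must be pruned identically in every block that shares that connection. A minimal NumPy sketch of this idea, assuming a sigmoid-relaxed differentiable mask and a group-L2 penalty (the function names and the exact relaxation are illustrative, not the paper's implementation):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def soft_masks(theta):
    """Differentiable (sigmoid-relaxed) channel masks from real-valued params."""
    return sigmoid(theta)

def cross_block_group_sparsity(thetas, lam=1e-2):
    """Group-L2 penalty over the SAME channel index across residual blocks.

    thetas: list of (C,) mask-parameter arrays, one per residual block
    sharing a skip connection. Penalizing the L2 norm of each per-channel
    group pushes the whole group toward zero together, so a channel is
    removed from every block at once and tensor shapes stay consistent.
    """
    stacked = np.stack(thetas)    # (num_blocks, C)
    groups = soft_masks(stacked)  # relaxed masks in (0, 1)
    # one L2 norm per channel group (axis 0 spans the blocks), then sum
    return lam * np.sum(np.linalg.norm(groups, axis=0))

# usage: three residual blocks sharing a skip connection, 4 channels each
thetas = [np.zeros(4), np.ones(4), -np.ones(4)]
penalty = cross_block_group_sparsity(thetas)
```

Because the penalty acts on groups rather than individual entries, gradient descent drives all masks for a given channel toward zero jointly, which is what lets the pruned residual blocks remain dimensionally compatible.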