We propose an information-theoretic knowledge distillation approach for compressing generative adversarial networks, which aims to maximize the mutual information between the teacher and student networks via a variational optimization based on an energy-based model. Because directly computing the mutual information in continuous domains is intractable, our approach instead optimizes the student network by maximizing a variational lower bound of the mutual information. To achieve a tight lower bound, we introduce an energy-based model relying on a deep neural network to represent a flexible variational distribution that deals with high-dimensional images and effectively captures spatial dependencies between pixels. Since the proposed method is a generic optimization algorithm, it can be conveniently incorporated into arbitrary generative adversarial networks and even dense prediction networks, e.g., image enhancement models. We demonstrate that the proposed algorithm consistently achieves outstanding performance in model compression of generative adversarial networks when combined with several existing models.
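For concreteness, the following is a minimal sketch of the standard variational (Barber–Agakov) lower bound on mutual information that the abstract invokes; the symbols $t$ for teacher outputs, $s$ for student outputs, the variational distribution $q_\phi$, and the energy function $E_\phi$ are placeholder notation introduced here for illustration, not taken from the paper.

% Variational lower bound on the mutual information between teacher
% outputs t and student outputs s; the gap of the inequality is the
% KL divergence KL( p(t|s) || q_phi(t|s) ) >= 0, so the bound is
% tight when q_phi matches the true conditional p(t|s).
\begin{align}
  I(t; s) &= H(t) + \mathbb{E}_{p(t,s)}\bigl[ \log p(t \mid s) \bigr] \\
          &\geq H(t) + \mathbb{E}_{p(t,s)}\bigl[ \log q_\phi(t \mid s) \bigr].
\end{align}
% Representing q_phi with an energy-based model yields a flexible
% variational family over high-dimensional images:
\begin{equation}
  q_\phi(t \mid s) = \frac{\exp\!\bigl(-E_\phi(t, s)\bigr)}{Z_\phi(s)},
  \qquad
  Z_\phi(s) = \int \exp\!\bigl(-E_\phi(t, s)\bigr)\, dt .
\end{equation}

Since $H(t)$ does not depend on the student, maximizing the bound amounts to maximizing $\mathbb{E}_{p(t,s)}[\log q_\phi(t \mid s)]$ jointly over the student and the energy-based model; handling the intractable normalizer $Z_\phi(s)$ is part of the paper's optimization procedure and is not detailed in this sketch.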