This paper studies probability distributions of penultimate activations of classification networks. We show that, when a classification network is trained with the cross-entropy loss, its final classification layer forms a Generative-Discriminative pair with a generative classifier based on a specific distribution of penultimate activations. More importantly, the distribution is parameterized by the weights of the final fully-connected layer, and can be considered as a generative model that synthesizes the penultimate activations without feeding input data. We empirically demonstrate that this generative model enables stable knowledge distillation in the presence of domain shift, and can transfer knowledge from a classifier to variational autoencoders and generative adversarial networks for class-conditional image generation.
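The generative-discriminative pairing described above can be illustrated with a small numerical sketch. The abstract does not name the specific distribution, so the example below swaps in a textbook stand-in and should not be read as the paper's model: it assumes each class-conditional distribution of penultimate activations is a unit-covariance Gaussian whose mean is the corresponding row of the final-layer weight matrix, with a uniform class prior. The weights `W`, the bias `b`, and all function names are hypothetical. Under these assumptions, applying Bayes' rule to the generative model gives the same class posterior as the softmax layer, and the same weights can be used to sample activations without any input data.

```python
import numpy as np

rng = np.random.default_rng(0)
num_classes, dim = 3, 4

# Hypothetical final-layer weights, one row per class (not taken from the paper).
W = rng.normal(size=(num_classes, dim))
# Assumption: the bias is tied to the weights so that the linear layer matches
# a unit-covariance Gaussian model with a uniform class prior.
b = -0.5 * np.sum(W ** 2, axis=1)


def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def discriminative_posterior(a):
    """p(class | activation) from the usual linear + softmax layer."""
    return softmax(a @ W.T + b)


def generative_posterior(a):
    """p(class | activation) via Bayes' rule on N(a; w_c, I), uniform prior."""
    sq_dist = ((a[:, None, :] - W[None, :, :]) ** 2).sum(axis=-1)
    # Terms shared by all classes cancel inside the softmax.
    return softmax(-0.5 * sq_dist)


def sample_activations(labels):
    """Draw synthetic penultimate activations for the given class labels."""
    return W[labels] + rng.normal(size=(len(labels), dim))


labels = rng.integers(0, num_classes, size=5)
acts = sample_activations(labels)  # no input images are involved
p_disc = discriminative_posterior(acts)
p_gen = generative_posterior(acts)
assert np.allclose(p_disc, p_gen)
```

The two posteriors agree because the only term in the Gaussian log-density that differs between the two computations, the squared norm of the activation, is the same for every class and cancels in the softmax. A different class-conditional family would need its own derivation.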