EGC: Image Generation and Classification via a Single Energy-Based Model

Learning image classification and image generation using the same set of network parameters is a challenging problem. Recent advanced approaches that perform well in one task often exhibit poor performance in the other. This work introduces an energy-based classifier and generator, namely EGC, which can achieve superior performance in both tasks using a single neural network. Unlike a conventional classifier that outputs a label given an image (i.e., a conditional distribution $p(y|\mathbf{x})$), the forward pass in EGC is a classifier that outputs a joint distribution $p(\mathbf{x},y)$, enabling an image generator in its backward pass by marginalizing out the label $y$. Concretely, the forward pass estimates the energy and the classification probability of a noisy image, while the backward pass estimates the score function used to denoise it. EGC achieves competitive generation results compared with state-of-the-art approaches on ImageNet-1k, CelebA-HQ and LSUN Church, while achieving superior classification accuracy and robustness against adversarial attacks on CIFAR-10. This work represents the first successful attempt to simultaneously excel in both tasks using a single set of network parameters. We believe that EGC bridges the gap between discriminative and generative learning.
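
The forward/backward split described above can be illustrated with a toy sketch. It is not the EGC architecture: it assumes the common energy-based reading of a classifier, in which the logits $f(\mathbf{x})$ define $p(\mathbf{x},y) \propto \exp(f(\mathbf{x})[y])$, so that marginalizing out $y$ gives the energy $E(\mathbf{x}) = -\log\sum_y \exp(f(\mathbf{x})[y])$ and the score is $-\nabla_{\mathbf{x}} E(\mathbf{x})$. The linear "network" (`W`, `B`), the function names, and the noise-free update are all hypothetical choices made for illustration, and the conditioning on noise level is omitted.

```python
import math

# Hypothetical toy "network": one linear layer mapping a 2-D input to 3 class logits.
W = [[1.0, -0.5], [0.2, 0.8], [-1.0, 0.3]]
B = [0.1, -0.2, 0.0]

def logits(x):
    """f(x)[y] for every class y (assumed linear here, for illustration only)."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(W, B)]

def log_sum_exp(values):
    """Numerically stable log(sum(exp(values)))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def forward(x):
    """Forward pass: return the energy E(x) and the class probabilities p(y|x)."""
    f = logits(x)
    lse = log_sum_exp(f)
    energy = -lse                            # label y marginalized out
    probs = [math.exp(v - lse) for v in f]   # softmax over the logits
    return energy, probs

def score(x):
    """Backward pass: -dE/dx, written out by hand for the linear logits above.

    For f(x) = Wx + b the gradient of log-sum-exp is sum_y p(y|x) * W[y].
    A real network would obtain this with automatic differentiation.
    """
    _, probs = forward(x)
    return [sum(p * row[d] for p, row in zip(probs, W)) for d in range(len(x))]

def denoise_step(x, step_size=0.1):
    """One noise-free step along the score; a simplified stand-in for denoising."""
    return [v + step_size * s for v, s in zip(x, score(x))]
```

Calling `forward(x)` yields both quantities from a single set of parameters, and `denoise_step(x)` moves `x` toward lower energy using only the gradient of that same forward computation, which is the sense in which one model can serve as both classifier and generator.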

