While GANs are powerful models for generating images, their inability to directly infer a latent space limits their use in applications that require an encoder. Our paper presents a simple architectural setup that combines the generative capabilities of a GAN with an encoder. We accomplish this by combining the encoder with the discriminator via shared weights and training them simultaneously with a new loss term. We model the output of the encoder's latent space with a Gaussian Mixture Model (GMM), which leads both to good clustering in this latent space and to improved image generation by the GAN. Our framework is generic and can easily be plugged into any GAN strategy. In particular, we demonstrate it with both Vanilla GAN and Wasserstein GAN, where in both cases it improves the generated images in terms of the IS and FID scores. Moreover, we show that our encoder learns a meaningful representation, as its clustering results are competitive with the current GAN-based state of the art in clustering.
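To make the shared-weight setup concrete, below is a minimal PyTorch sketch of a discriminator whose convolutional backbone is shared with an encoder head. All details (layer widths, latent dimension, 32x32 inputs, class and variable names) are illustrative assumptions for exposition, not the paper's actual implementation.

```python
# Illustrative sketch (assumed architecture, not the authors' code):
# a discriminator and an encoder that share one convolutional backbone.
import torch
import torch.nn as nn


class SharedDiscriminatorEncoder(nn.Module):
    """Discriminator score and latent code produced from shared features."""

    def __init__(self, latent_dim: int = 64):
        super().__init__()
        # Shared convolutional backbone (assumes 3x32x32 input images).
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 64, 4, stride=2, padding=1),    # 32x32 -> 16x16
            nn.LeakyReLU(0.2),
            nn.Conv2d(64, 128, 4, stride=2, padding=1),  # 16x16 -> 8x8
            nn.LeakyReLU(0.2),
            nn.Conv2d(128, 256, 4, stride=2, padding=1), # 8x8 -> 4x4
            nn.LeakyReLU(0.2),
            nn.Flatten(),
        )
        feat_dim = 256 * 4 * 4
        self.disc_head = nn.Linear(feat_dim, 1)          # real/fake score
        self.enc_head = nn.Linear(feat_dim, latent_dim)  # encoder latent code

    def forward(self, x):
        feats = self.backbone(x)
        return self.disc_head(feats), self.enc_head(feats)


if __name__ == "__main__":
    model = SharedDiscriminatorEncoder()
    images = torch.randn(8, 3, 32, 32)
    score, code = model(images)
    print(score.shape, code.shape)  # torch.Size([8, 1]) torch.Size([8, 64])
```

In this sketch the two heads are trained jointly, so gradients from both the adversarial objective and the encoder loss update the shared backbone; the encoder outputs could then be modeled with a Gaussian mixture (e.g. `sklearn.mixture.GaussianMixture`) for clustering, in the spirit of the GMM modeling the abstract describes.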