Generation of photo-realistic images, semantic editing, and representation learning are a few of the many potential applications of high-resolution generative models. Recent progress in GANs has established them as an excellent choice for such tasks. However, since they do not provide an inference model, image editing and downstream tasks such as classification cannot be performed on real images using the GAN latent space. Despite numerous efforts to train an inference model or to design an iterative method that inverts a pre-trained generator, previous methods are dataset-specific (e.g., human face images) and architecture-specific (e.g., StyleGAN), and are nontrivial to extend to novel datasets or architectures. We propose a general framework that is agnostic to both architecture and dataset. Our key insight is that, by training the inference and generative models together, we allow them to adapt to each other and converge to a higher-quality model. Our \textbf{InvGAN}, short for Invertible GAN, successfully embeds real images into the latent space of a high-quality generative model. This allows us to perform image inpainting, merging, interpolation, and online data augmentation. We demonstrate this with extensive qualitative and quantitative experiments.