Generation of photo-realistic images, semantic editing, and representation learning are a few of the many potential applications of high-resolution generative models. Recent progress in GANs has established them as an excellent choice for such tasks. However, since they do not provide an inference model, image editing and downstream tasks such as classification cannot be performed on real images using the GAN latent space. Despite numerous efforts to train an inference model or to design an iterative method that inverts a pre-trained generator, previous methods are dataset-specific (e.g., human face images) and architecture-specific (e.g., StyleGAN). These methods are nontrivial to extend to novel datasets or architectures. We propose a general framework that is agnostic to both architecture and dataset. Our key insight is that, by training the inference model and the generative model together, we allow them to adapt to each other and to converge to a better-quality model. Our \textbf{InvGAN}, short for Invertible GAN, successfully embeds real images into the latent space of a high-quality generative model. This allows us to perform image inpainting, merging, interpolation, and online data augmentation. We demonstrate this with extensive qualitative and quantitative experiments.