Recent work has shown that a variety of semantics emerge in the latent space of Generative Adversarial Networks (GANs) when they are trained to synthesize images. However, it is difficult to use these learned semantics for real image editing. A common practice for feeding a real image to a trained GAN generator is to invert it back into a latent code. Existing inversion methods, however, typically focus on reconstructing the target image at the pixel level and fail to land the inverted code in the semantic domain of the original latent space. As a result, the reconstructed image does not support semantic editing well when the inverted code is varied. To solve this problem, we propose an in-domain GAN inversion approach, which not only faithfully reconstructs the input image but also ensures that the inverted code is semantically meaningful for editing. We first learn a novel domain-guided encoder to project a given image to the native latent space of GANs. We then propose domain-regularized optimization, which involves the encoder as a regularizer to fine-tune the code produced by the encoder and better recover the target image. Extensive experiments suggest that our inversion method achieves satisfactory real image reconstruction and, more importantly, facilitates various image editing tasks, significantly outperforming state-of-the-art methods.
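
The two-stage procedure described above (encode the image, then refine the code with the encoder acting as a regularizer) can be illustrated with a toy sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the generator and encoder are small fixed linear maps (`W`, `A`) rather than trained networks, the reconstruction term is a plain squared error, the regularizer is the squared distance between the code and the re-encoded reconstruction, the weight `lam` is arbitrary, and gradients are computed numerically.

```python
# Toy sketch of encoder-initialized, encoder-regularized inversion.
# All names, matrices, and hyperparameters are hypothetical stand-ins.

def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def sq_dist(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

# Hypothetical generator G: 2-D latent code -> 3-value "image".
W = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# Hypothetical encoder E: "image" -> latent code (an imperfect inverse of G).
A = [[0.6, -0.3, 0.3], [-0.3, 0.6, 0.3]]

def generator(z):
    return matvec(W, z)

def encoder(x):
    return matvec(A, x)

def objective(z, x, lam):
    """Reconstruction error plus an encoder-based regularizer on the code."""
    recon = generator(z)
    return sq_dist(x, recon) + lam * sq_dist(z, encoder(recon))

def numerical_grad(f, z, eps=1e-6):
    """Central-difference gradient of f at z."""
    grad = []
    for i in range(len(z)):
        hi, lo = list(z), list(z)
        hi[i] += eps
        lo[i] -= eps
        grad.append((f(hi) - f(lo)) / (2 * eps))
    return grad

def invert(x, lam=2.0, lr=0.05, steps=200):
    """Start from the encoder's code, then refine it by gradient descent."""
    z = encoder(x)
    f = lambda code: objective(code, x, lam)
    for _ in range(steps):
        g = numerical_grad(f, z)
        z = [zi - lr * gi for zi, gi in zip(z, g)]
    return z
```

Calling `invert([1.0, 2.0, 2.5])` returns a refined two-dimensional code whose objective value is lower than that of the raw encoder output `encoder([1.0, 2.0, 2.5])`. In a realistic setting the linear maps would be replaced by a trained generator and encoder, and the numerical gradient by automatic differentiation.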