StyleGAN2 has been demonstrated to be a powerful image generation engine that supports semantic editing. However, in order to manipulate a real-world image, one must first retrieve a latent representation in StyleGAN's latent space that decodes to an image as close as possible to the target image. For many real-world images, no such latent representation exists, which necessitates tuning the generator network. We present a per-image optimization method that tunes a StyleGAN2 generator by applying a local edit to the generator's weights, achieving nearly perfect inversion while still supporting image editing, since the rest of the mapping between an input latent representation tensor and an output image is kept relatively intact. The method is based on a one-shot training of a set of shallow update networks (Gradient Modification Modules) that modify the layers of the generator. After training the Gradient Modification Modules, a modified generator is obtained by a single application of these networks to the original parameters, and the previous editing capabilities of the generator are maintained. Our experiments show a sizable gap in performance over the current state of the art in this very active domain. Our code is available at \url{https://github.com/sheffier/gani}.
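The core mechanism can be illustrated with a minimal numpy sketch: a shallow update network takes a layer's original weights and produces an additive update, and a single application of the trained module yields the modified generator weights. All names, shapes, and the additive-update form here are assumptions for illustration, not the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "Gradient Modification Module" (GMM): a shallow two-layer
# map from a layer's original weights to a weight update. The output layer
# is zero-initialized so that, before per-image training, the modified
# generator is identical to the original one.
def init_gmm(dim, hidden=8):
    return {
        "w1": rng.normal(scale=0.01, size=(dim, hidden)),
        "w2": np.zeros((hidden, dim)),  # zero init => zero update at start
    }

def apply_gmm(gmm, w_orig):
    # One-shot application: original layer weights in, modified weights out.
    delta = np.tanh(w_orig @ gmm["w1"]) @ gmm["w2"]
    return w_orig + delta

# One generator "layer" represented as a flattened weight vector.
w_layer = rng.normal(size=(1, 16))
gmm = init_gmm(16)

w_modified = apply_gmm(gmm, w_layer)
# With the zero-initialized output layer, the edit starts out as a no-op,
# keeping the original latent-to-image mapping intact.
print(np.allclose(w_modified, w_layer))  # True at zero initialization
```

In an actual per-image optimization loop, the GMM parameters would be trained so that the modified generator reconstructs the target image, after which the modules are applied once and discarded.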