Existing GAN inversion and editing methods work well for aligned objects with a clean background, such as portraits and animal faces, but often struggle with more difficult categories that have complex scene layouts and object occlusions, such as cars, animals, and outdoor images. We propose a new method to invert and edit such complex images in the latent space of GANs, such as StyleGAN2. Our key idea is to explore inversion with a collection of layers, spatially adapting the inversion process to the difficulty of the image. We learn to predict the "invertibility" of different image segments and project each segment into a latent layer. Easier regions can be inverted into an earlier layer in the generator's latent space, while more challenging regions can be inverted into a later feature space. Experiments show that our method obtains better inversion results on complex categories than recent approaches, while maintaining downstream editability. Please refer to our project page at https://www.cs.cmu.edu/~SAMInversion.
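The per-segment layer selection described in the abstract can be sketched as a simple score-to-layer mapping. This is a minimal illustrative sketch, not the paper's implementation: the thresholds, the layer names (`W+`, `F4`, `F8`), and the segment scores are all hypothetical assumptions.

```python
# Hypothetical sketch of spatially adaptive inversion: each image segment's
# predicted "invertibility" score decides which latent layer it is projected
# into. Thresholds and layer names below are illustrative, not the paper's.

def assign_inversion_layer(invertibility, thresholds=(0.75, 0.4)):
    """Map a segment's invertibility score in [0, 1] to a latent layer.

    Easier segments go to an earlier latent space (more editable);
    harder segments go to later feature spaces (higher fidelity).
    """
    easy, medium = thresholds
    if invertibility >= easy:
        return "W+"   # early latent space: most editable
    elif invertibility >= medium:
        return "F4"   # intermediate feature layer (hypothetical name)
    else:
        return "F8"   # late feature layer: best reconstruction, least editable

# Example: segments of a car scene with assumed predicted scores.
segments = {"sky": 0.9, "car body": 0.6, "occluded pedestrian": 0.2}
layers = {name: assign_inversion_layer(s) for name, s in segments.items()}
print(layers)  # {'sky': 'W+', 'car body': 'F4', 'occluded pedestrian': 'F8'}
```

Routing easy regions to an early latent preserves editability where possible, while falling back to later feature spaces only where reconstruction would otherwise fail.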