In this paper, we study the 3D-aware image attribute editing problem, which has wide applications in practice. Recent methods solve this problem by training a shared encoder that maps images into a 3D generator's latent space, or by per-image latent code optimization, and then edit images in that latent space. Despite promising results near the input view, these methods still suffer from 3D inconsistency of the produced images at large camera poses and from imprecise attribute editing, such as unintentionally altering unspecified attributes. For more efficient image inversion, we train a single shared encoder for all images. To alleviate 3D inconsistency at large camera poses, we propose two novel techniques, an alternating training scheme and a multi-view identity loss, to maintain 3D consistency and subject identity. As for imprecise image editing, we attribute the problem to the gap between the latent space of real images and that of generated images. We compare the latent space and the inversion manifold of GAN models and demonstrate that editing in the inversion manifold achieves better results in both quantitative and qualitative evaluations. Extensive experiments show that our method produces more 3D-consistent images and achieves more precise image editing than previous work. Source code and pretrained models can be found on our project page: https://mybabyyh.github.io/Preim3D/
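The abstract names a multi-view identity loss as one of the two techniques used to preserve 3D consistency and subject identity at large camera poses. Its exact form is not given here; below is a minimal PyTorch-style sketch of one plausible formulation, assuming a 3D-aware `generator` that renders a latent code `w_latent` under a given camera pose and an off-the-shelf identity network `id_encoder` (e.g., a face-recognition embedding). The names, signatures, and the cosine-distance form are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def multi_view_id_loss(generator, id_encoder, w_latent, src_image, poses):
    """Sketch of a multi-view identity loss (assumed formulation):
    render the inverted latent from several camera poses and keep each
    rendering's identity embedding close to that of the source image."""
    # Identity embedding of the input view (e.g., from a face-recognition net)
    src_emb = F.normalize(id_encoder(src_image), dim=-1)
    loss = 0.0
    for pose in poses:
        # Render the same latent code under a novel camera pose
        novel_view = generator(w_latent, camera=pose)
        view_emb = F.normalize(id_encoder(novel_view), dim=-1)
        # Cosine-distance penalty: 1 - <src, view>
        loss = loss + (1.0 - (src_emb * view_emb).sum(dim=-1)).mean()
    return loss / len(poses)
```

In this sketch, averaging the penalty over several sampled poses encourages the encoder to produce latent codes whose renderings keep the subject's identity even far from the input view, which is the stated goal of the loss.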