Current Generative Adversarial Networks (GANs) produce photorealistic renderings of portrait images. Embedding real images into the latent space of such models enables high-level image editing. While recent methods provide considerable semantic control over the (re-)generated images, they can only generate a limited set of viewpoints and cannot explicitly control the camera. Such 3D camera control is required for 3D virtual and mixed reality applications. In our solution, we use a few images of a face to perform 3D reconstruction, and we introduce the notion of the GAN camera manifold, the key element allowing us to precisely define the range of images that the GAN can reproduce in a stable manner. We train a small face-specific neural implicit representation network to map a captured face to this manifold and complement it with a warping scheme to obtain free-viewpoint novel-view synthesis. We show how our approach, owing to its precise camera control, enables the integration of a pre-trained StyleGAN into standard 3D rendering pipelines, allowing, e.g., stereo rendering or consistent insertion of faces in synthetic 3D environments. Our solution offers the first truly free-viewpoint rendering of realistic faces at interactive rates, using only a small number of casual photos as input, while simultaneously supporting semantic edits such as changes in facial expression or lighting.