Face swapping aims to inject a source image's identity (i.e., facial features) into a target image while strictly preserving the target's identity-irrelevant attributes. However, we observe that previous approaches still suffer from source attribute leakage, where the source image's attributes interfere with those of the target image. In this paper, we analyze the latent space of StyleGAN and identify an adequate combination of latents suited to the face swapping task. Based on these findings, we develop a simple yet robust face swapping model, RobustSwap, which is resistant to potential source attribute leakage. Moreover, we exploit the coordination of a 3DMM's implicit and explicit information as guidance to incorporate the structure of the source image and the precise pose of the target image. Although our method is trained solely on an image dataset without identity labels, our model is capable of generating high-fidelity and temporally consistent videos. Through extensive qualitative and quantitative evaluations, we demonstrate that our method achieves significant improvements over previous face swapping models in synthesizing both images and videos. The project page is available at https://robustswap.github.io/