This paper is on face/head reenactment, where the goal is to transfer the facial pose (3D head orientation and expression) of a target face to a source face. Previous methods focus on learning embedding networks for identity and pose disentanglement, which proves to be a rather hard task, degrading the quality of the generated images. We take a different approach, bypassing the training of such networks, by using (fine-tuned) pre-trained GANs, which have been shown to be capable of producing high-quality facial images. Because GANs are characterized by weak controllability, the core of our approach is a method to discover which directions in the latent GAN space are responsible for controlling facial pose and expression variations. We present a simple pipeline to learn such directions with the aid of a 3D shape model that, by construction, already captures disentangled directions for facial pose, identity, and expression. Moreover, we show that by embedding real images in the GAN latent space, our method can be successfully used for the reenactment of real-world faces. Our method features several favorable properties, including the use of a single source image (one-shot) and support for cross-person reenactment. Our qualitative and quantitative results show that our approach often produces reenacted faces of significantly higher quality than those produced by state-of-the-art methods on the standard VoxCeleb1 and VoxCeleb2 benchmarks. Source code is available at: https://github.com/StelaBou/stylegan_directions_face_reenactment
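To make the core editing step concrete, below is a minimal, self-contained sketch of the kind of latent-direction edit the abstract describes: shifting a source latent code along learned directions by the difference between target and source 3D shape model (3DMM) parameters. All names, dimensions, and the random matrix `A` are illustrative assumptions, not the authors' implementation; the actual learned directions, losses, and generator come from the linked repository.

```python
import torch

LATENT_DIM = 512   # StyleGAN2 W-space dimensionality
NUM_PARAMS = 12    # assumed number of 3DMM pose/expression parameters

# A: matrix whose columns are latent directions, one per 3D shape model
# parameter (head pose angles + expression coefficients). In the paper such
# directions are learned so that moving along one changes only the
# corresponding pose/expression factor; here A is random for illustration.
A = torch.randn(LATENT_DIM, NUM_PARAMS)

def reenact(w_source: torch.Tensor,
            p_source: torch.Tensor,
            p_target: torch.Tensor) -> torch.Tensor:
    """Shift the source latent code along the learned directions by the
    difference between target and source 3DMM pose/expression parameters."""
    delta_p = p_target - p_source      # (NUM_PARAMS,)
    return w_source + A @ delta_p      # edited latent code in W space

# Usage sketch: w_source would come from inverting a real source image into
# the GAN latent space, and p_* from fitting the 3D shape model to the source
# and target frames; the edited code is then decoded by the (fine-tuned)
# pre-trained generator to produce the reenacted face.
w_src = torch.randn(LATENT_DIM)
w_edit = reenact(w_src, torch.zeros(NUM_PARAMS), torch.randn(NUM_PARAMS))
```

Because the edit is a single linear shift per frame, the same source latent can be re-driven by any sequence of target parameters, which is what enables the one-shot and cross-person settings mentioned above.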