Recent advances in generative adversarial networks (GANs) have led to remarkable achievements in face image synthesis. While methods that use style-based GANs can generate strikingly photorealistic face images, it is often difficult to control the characteristics of the generated faces in a meaningful and disentangled way. Prior approaches aim to achieve such semantic control and disentanglement within the latent space of a previously trained GAN. In contrast, we propose a framework that a priori models physical attributes of the face such as 3D shape, albedo, pose, and lighting explicitly, thus providing disentanglement by design. Our method, MOST-GAN, integrates the expressive power and photorealism of style-based GANs with the physical disentanglement and flexibility of nonlinear 3D morphable models, which we couple with a state-of-the-art 2D hair manipulation network. MOST-GAN achieves photorealistic manipulation of portrait images with fully disentangled 3D control over their physical attributes, enabling extreme manipulation of lighting, facial expression, and pose variations up to full profile view.
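To make the "disentanglement by design" idea concrete, here is a minimal, hypothetical PyTorch sketch of the interface such a model exposes: separate codes for shape, albedo, pose, and lighting are combined by a generator, so each physical attribute can be edited while the others stay fixed. All module names, dimensions, and the toy decoder below are illustrative assumptions, not the paper's actual architecture.

```python
# Hypothetical sketch of disentangled-by-design face generation in the spirit
# of MOST-GAN. The real model couples a nonlinear 3D morphable model with a
# style-based GAN; here a small MLP stands in for that pipeline.
import torch
import torch.nn as nn

class DisentangledFaceGenerator(nn.Module):
    def __init__(self, dim_shape=64, dim_albedo=64, dim_pose=6, dim_light=9,
                 img_size=64):
        super().__init__()
        feat = dim_shape + dim_albedo + dim_pose + dim_light
        # Stand-in decoder: maps the concatenated physical codes to an RGB image.
        self.decoder = nn.Sequential(
            nn.Linear(feat, 256), nn.ReLU(),
            nn.Linear(256, 3 * img_size * img_size), nn.Tanh(),
        )
        self.img_size = img_size

    def forward(self, shape, albedo, pose, light):
        # Each attribute enters through its own code, so edits stay localized.
        z = torch.cat([shape, albedo, pose, light], dim=-1)
        img = self.decoder(z)
        return img.view(-1, 3, self.img_size, self.img_size)

gen = DisentangledFaceGenerator()
shape = torch.randn(1, 64)
albedo = torch.randn(1, 64)
pose = torch.zeros(1, 6)    # e.g. rotation + translation parameters (assumed)
light = torch.randn(1, 9)   # e.g. spherical-harmonics lighting coefficients (assumed)

img_a = gen(shape, albedo, pose, light)
# Editing only the pose code re-poses the face while the shape, albedo, and
# lighting codes are untouched -- the disentangled control the abstract describes.
pose_edited = pose.clone()
pose_edited[0, 0] = 0.5     # e.g. rotate toward profile view
img_b = gen(shape, albedo, pose_edited, light)
print(img_a.shape, img_b.shape)  # torch.Size([1, 3, 64, 64]) twice
```

In the actual method, the decoder's role is played by the 3D morphable model and style-based generator, so pose and lighting edits produce physically consistent renderings rather than arbitrary image changes.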