Portrait stylization is a long-standing task with a wide range of applications. Although 2D-based methods have made great progress in recent years, real-world applications such as the metaverse and games often demand 3D content. Meanwhile, the requirement for 3D data, which is costly to acquire, significantly impedes the development of 3D portrait stylization methods. In this paper, inspired by the success of 3D-aware GANs, which bridge the 2D and 3D domains by using 3D fields as the intermediate representation for rendering 2D images, we propose a novel method, dubbed HyperStyle3D, based on 3D-aware GANs for 3D portrait stylization. At the core of our method is a hyper-network learned to manipulate the parameters of the generator in a single forward pass. It not only offers a strong capacity to handle multiple styles with a single model, but also enables flexible fine-grained stylization that affects only the texture, the shape, or a local part of the portrait. While the use of 3D-aware GANs bypasses the requirement for 3D data, we further alleviate the necessity of style images by using the CLIP model as the stylization guidance. We conduct extensive experiments across styles, attributes, and shapes, and additionally measure 3D consistency. These experiments demonstrate the superior capability of our HyperStyle3D model in rendering 3D-consistent images in diverse styles, deforming the face shape, and editing various attributes.
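The core mechanism described above, a hyper-network that outputs parameter updates for a generator in a single forward pass, can be illustrated with a minimal sketch. This is not the paper's implementation; it assumes a toy one-layer "generator" and a small MLP hyper-network (all sizes and names here are hypothetical) that maps a style code to a weight offset, which is then added to the frozen base weights:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: a style code conditions one generator layer.
style_dim, hidden, d_in, d_out = 16, 32, 8, 8

# Base generator layer weights (kept frozen during stylization).
W_gen = rng.standard_normal((d_out, d_in))

# Hyper-network: a small MLP mapping a style code to a weight offset.
W1 = rng.standard_normal((hidden, style_dim)) * 0.1
W2 = rng.standard_normal((d_out * d_in, hidden)) * 0.01

def hyper_offset(style_code):
    """Predict a delta for the generator weights in one forward pass."""
    h = np.tanh(W1 @ style_code)               # hidden activation
    return (W2 @ h).reshape(d_out, d_in)       # predicted weight offset

style_code = rng.standard_normal(style_dim)
W_styled = W_gen + hyper_offset(style_code)    # stylized generator weights

x = rng.standard_normal(d_in)
y = W_styled @ x                               # generator layer with new weights
print(y.shape)  # (8,)
```

Because the hyper-network, not the generator, is what gets trained, a single model can in principle produce different weight offsets for different style codes, which matches the multi-style capability the abstract claims.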