Traditional avatar creation pipelines are costly. In contrast, contemporary generative approaches learn the data distribution directly from photographs, and the state of the art can now produce highly photo-realistic images. Although many works attempt to extend unconditional generative models to achieve some level of controllability, ensuring multi-view consistency remains challenging, especially under large poses. In this work, we propose a 3D portrait generation network that produces 3D-consistent portraits while remaining controllable through semantic parameters for pose, identity, expression, and lighting. The generative network uses a neural scene representation to model portraits in 3D, and its generation is guided by a parametric face model that supports explicit control. Although latent disentanglement can be further enhanced by contrasting images with partially different attributes, noticeable inconsistency remains in non-face areas, e.g., hair and background, when animating expressions. We solve this with a volume blending strategy that forms a composite output by blending dynamic and static radiance fields, with the two parts segmented by a jointly learned semantic field. Our method outperforms prior art in extensive experiments, producing realistic portraits with vivid expressions and natural lighting under free viewpoints. The proposed method also generalizes to real images as well as out-of-domain cartoon faces, showing great promise for real applications. Additional video results and code will be available on the project webpage.
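To make the volume blending idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes that each sample along a camera ray has a density and color from a dynamic (expression-dependent) field and from a static field, and that a semantic field supplies a soft probability that the sample belongs to the dynamic face region. The names `blend_sample`, `render_ray`, and `face_prob`, the convex-combination blend, and the uniform sample spacing `delta` are all illustrative assumptions; the abstract does not specify the exact formulation.

```python
import math

def blend_sample(dyn, sta, face_prob):
    """Blend one sample's (density, (r, g, b)) from the two fields.

    face_prob is a hypothetical soft mask in [0, 1] taken from the
    semantic field: 1 means fully dynamic (face), 0 means fully static.
    """
    m = min(max(face_prob, 0.0), 1.0)
    sigma = m * dyn[0] + (1.0 - m) * sta[0]
    rgb = tuple(m * d + (1.0 - m) * s for d, s in zip(dyn[1], sta[1]))
    return sigma, rgb

def render_ray(samples, delta):
    """Composite blended samples front to back (standard volume rendering)."""
    color = [0.0, 0.0, 0.0]
    transmittance = 1.0
    for sigma, rgb in samples:
        alpha = 1.0 - math.exp(-sigma * delta)  # opacity of this segment
        weight = transmittance * alpha
        color = [c + weight * v for c, v in zip(color, rgb)]
        transmittance *= 1.0 - alpha
    return color, 1.0 - transmittance

# Toy ray: the first samples fall on the face, the last on hair or background.
dyn = (2.0, (1.0, 0.0, 0.0))  # dynamic field output (assumed values)
sta = (5.0, (0.0, 0.0, 1.0))  # static field output (assumed values)
face_probs = [1.0, 1.0, 0.5, 0.0]
samples = [blend_sample(dyn, sta, p) for p in face_probs]
color, opacity = render_ray(samples, delta=0.1)
```

In this sketch, samples with `face_prob` near 0 take their density and color from the static field only, so changing the expression input of the dynamic field would leave hair and background samples unchanged, which is the consistency property the abstract describes.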