Existing 3D-aware facial generation methods face a dilemma between quality and editability: they either generate editable results at low resolution or high-quality ones with no editing flexibility. In this work, we propose a new approach that brings the best of both worlds together. Our system consists of three major components: (1) a 3D-semantics-aware generative model that produces view-consistent, disentangled face images and semantic masks; (2) a hybrid GAN inversion approach that initializes the latent codes from the semantic and texture encoders and further optimizes them for faithful reconstruction; and (3) a canonical editor that enables efficient manipulation of semantic masks in the canonical view and produces high-quality editing results. Our approach supports many applications, e.g., free-view face drawing, editing, and style control. Both quantitative and qualitative results show that our method reaches the state of the art in terms of photorealism, faithfulness, and efficiency.
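The hybrid inversion scheme described in component (2) — an encoder that predicts an initial latent code, followed by per-image optimization for faithful reconstruction — can be illustrated with a minimal numerical sketch. This is not the paper's implementation: a fixed linear "generator" `G(w) = A @ w` and a deliberately noisy linear "encoder" `E(x) = B @ x` stand in for the real networks, and all names and shapes are illustrative assumptions.

```python
import numpy as np

# Hypothetical stand-ins for the paper's networks (illustrative only):
# a linear "generator" G(w) = A @ w and a noisy linear "encoder" E(x) = B @ x.
rng = np.random.default_rng(0)
A = rng.standard_normal((64, 8))                              # generator weights
B = np.linalg.pinv(A) + 0.05 * rng.standard_normal((8, 64))   # imperfect encoder

def hybrid_inversion(x, steps=200):
    """Hybrid GAN inversion sketch: encoder init, then latent optimization."""
    w = B @ x                                  # (1) encoder predicts initial latent
    lr = 1.0 / np.linalg.norm(A, 2) ** 2       # safe step size for this quadratic
    for _ in range(steps):                     # (2) refine w to reconstruct x
        grad = A.T @ (A @ w - x)               # gradient of 0.5 * ||A @ w - x||^2
        w -= lr * grad
    return w

x_target = A @ rng.standard_normal(8)          # an image on G's output manifold
w_init = B @ x_target                          # encoder-only inversion
w_hat = hybrid_inversion(x_target)             # encoder init + optimization
err_init = float(np.linalg.norm(A @ w_init - x_target))
err_final = float(np.linalg.norm(A @ w_hat - x_target))
```

The design point the sketch captures is the trade-off the abstract alludes to: encoder-only inversion is fast but approximate, pure optimization is faithful but needs a good starting point, and combining them gets a faithful reconstruction in few steps.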