Recently, 3D-aware GAN methods built on neural radiance fields (NeRF) have developed rapidly. However, existing methods model the whole image as a single overall radiance field, which limits the partial semantic editability of the synthesized results. Since NeRF renders an image pixel by pixel, it can be split along the spatial dimension. We propose a Compositional Neural Radiance Field (CNeRF) for semantic 3D-aware portrait synthesis and manipulation. CNeRF divides the image into semantic regions, learns an independent neural radiance field for each region, and finally fuses them to render the complete image. Thus we can manipulate each synthesized semantic region independently while keeping the other parts unchanged. Furthermore, CNeRF is designed to decouple shape and texture within each semantic region. Compared to state-of-the-art 3D-aware GAN methods, our approach enables fine-grained semantic-region manipulation while maintaining high-quality, 3D-consistent synthesis. Ablation studies show the effectiveness of the structure and loss functions used in our method. In addition, real-image inversion and cartoon portrait 3D editing experiments demonstrate the application potential of our method.
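To make the compositional idea concrete, below is a minimal PyTorch sketch of per-region radiance fields fused by density weighting before standard volume rendering. The module and function names (RegionNeRF, fuse_regions), the latent conditioning, and the exact fusion rule are illustrative assumptions for exposition, not the paper's actual generator.

```python
import torch
import torch.nn as nn

class RegionNeRF(nn.Module):
    """Small MLP predicting density and color for one semantic region.
    Hypothetical simplification: the real generator also conditions on
    separate per-region shape and texture latent codes."""
    def __init__(self, latent_dim=64, hidden=128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3 + latent_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),  # 1 density channel + 3 color channels
        )

    def forward(self, xyz, z):
        # xyz: (N, 3) 3D sample points; z: (N, latent_dim) region latent code
        out = self.mlp(torch.cat([xyz, z], dim=-1))
        sigma = torch.relu(out[..., :1])      # non-negative density
        color = torch.sigmoid(out[..., 1:])   # RGB in [0, 1]
        return sigma, color

def fuse_regions(sigmas, colors, eps=1e-8):
    """Density-weighted fusion of per-region fields at each sample point.
    sigmas: list of R tensors of shape (N, 1); colors: list of (N, 3)."""
    sigma_stack = torch.stack(sigmas, dim=0)                      # (R, N, 1)
    color_stack = torch.stack(colors, dim=0)                      # (R, N, 3)
    weights = sigma_stack / (sigma_stack.sum(0, keepdim=True) + eps)
    fused_sigma = sigma_stack.sum(0)                              # total density
    fused_color = (weights * color_stack).sum(0)                  # blended color
    return fused_sigma, fused_color
```

In this sketch, the fused density and color would be passed to a standard volume-rendering integral; editing one semantic region then amounts to changing only that region's latent code while the other regions' codes, and hence their rendered appearance, remain fixed.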