We propose StyleNeRF, a 3D-aware generative model for photo-realistic high-resolution image synthesis with high multi-view consistency, which can be trained on unstructured 2D images. Existing approaches either cannot synthesize high-resolution images with fine details or yield noticeable 3D-inconsistent artifacts. In addition, many of them lack control over style attributes and explicit 3D camera poses. StyleNeRF integrates the neural radiance field (NeRF) into a style-based generator to tackle the aforementioned challenges, i.e., improving rendering efficiency and 3D consistency for high-resolution image generation. We perform volume rendering only to produce a low-resolution feature map and progressively apply upsampling in 2D to address the first issue. To mitigate the inconsistencies caused by 2D upsampling, we propose multiple designs, including a better upsampler and a new regularization loss. With these designs, StyleNeRF can synthesize high-resolution images at interactive rates while preserving 3D consistency at high quality. StyleNeRF also enables control of camera poses and different levels of styles, which can generalize to unseen views. It also supports challenging tasks, including zoom-in and out, style mixing, inversion, and semantic editing.
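The two-stage pipeline described above (volume rendering at low resolution, then progressive 2D upsampling to the target resolution) can be sketched as follows. This is a minimal NumPy illustration of the data flow only: the function names are hypothetical, the "volume rendering" step is a placeholder producing a feature map of the right shape, and nearest-neighbour upsampling stands in for StyleNeRF's learned upsampler.

```python
import numpy as np

def volume_render_features(h=32, w=32, c=16, seed=0):
    # Placeholder for NeRF-style volume rendering. In StyleNeRF this step
    # integrates radiance-field features along camera rays, but only at a
    # low resolution to keep rendering cheap; here we just fabricate a
    # (channels, height, width) feature map of the right shape.
    rng = np.random.default_rng(seed)
    return rng.standard_normal((c, h, w))

def upsample2x(feat):
    # Nearest-neighbour 2x upsampling in 2D. The actual model uses a
    # carefully designed learned upsampler plus a regularization loss to
    # suppress the 3D inconsistencies this naive operation would cause.
    return feat.repeat(2, axis=1).repeat(2, axis=2)

def render(target_res=256, low_res=32):
    # Stage 1: cheap low-resolution volume rendering.
    feat = volume_render_features(low_res, low_res)
    # Stage 2: progressive 2D upsampling until the target resolution.
    while feat.shape[-1] < target_res:
        feat = upsample2x(feat)
    return feat
```

The point of the split is cost: full volume rendering at 1024x1024 would require integrating along roughly a thousand times more rays than at 32x32, so deferring resolution growth to cheap 2D operations is what makes interactive-rate synthesis feasible.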