Making generative models 3D-aware bridges the 2D image space and the 3D physical world yet remains challenging. Recent attempts equip a Generative Adversarial Network (GAN) with a Neural Radiance Field (NeRF), which maps 3D coordinates to pixel values, as a 3D prior. However, the implicit function in NeRF has a very local receptive field, making it hard for the generator to become aware of the global structure. Meanwhile, NeRF is built on volume rendering, which can be too costly to produce high-resolution results, increasing the optimization difficulty. To alleviate these two problems, we propose a novel framework, termed VolumeGAN, for high-fidelity 3D-aware image synthesis, which explicitly learns a structural representation and a textural representation. We first learn a feature volume to represent the underlying structure, which is then converted to a feature field using a NeRF-like model. The feature field is further accumulated into a 2D feature map as the textural representation, followed by a neural renderer for appearance synthesis. Such a design enables independent control of the shape and the appearance. Extensive experiments on a wide range of datasets show that our approach achieves substantially higher image quality and better 3D control than previous methods.
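To make the described pipeline concrete, the sketch below wires the four stages together in PyTorch: a 3D-convolutional decoder producing the structural feature volume, a NeRF-like MLP turning sampled volume features into a feature field with densities, alpha compositing along rays to accumulate the textural 2D feature map, and a small convolutional neural renderer for appearance. This is a minimal illustrative sketch, not the authors' implementation; all module names, layer widths, volume resolutions, and the ray-sampling convention are assumptions made for exposition.

```python
# Hypothetical, minimal sketch of a VolumeGAN-style generator.
# Shapes and hyperparameters are illustrative assumptions only.
import torch
import torch.nn as nn
import torch.nn.functional as F


class VolumeGANSketch(nn.Module):
    def __init__(self, z_dim=128, vol_ch=32, feat_ch=64):
        super().__init__()
        # Structural representation: decode a latent code into a
        # coarse 3D feature volume with transposed 3D convolutions.
        self.volume_gen = nn.Sequential(
            nn.ConvTranspose3d(z_dim, vol_ch * 2, 4),         # 1^3 -> 4^3
            nn.LeakyReLU(0.2),
            nn.ConvTranspose3d(vol_ch * 2, vol_ch, 4, 2, 1),  # 4^3 -> 8^3
            nn.LeakyReLU(0.2),
        )
        # NeRF-like MLP: maps a sampled volume feature (plus the 3D
        # coordinate) to a point-wise feature and a density value.
        self.mlp = nn.Sequential(
            nn.Linear(vol_ch + 3, 128), nn.ReLU(),
            nn.Linear(128, feat_ch + 1),
        )
        # Neural renderer: turns the accumulated 2D feature map into RGB.
        self.renderer = nn.Sequential(
            nn.Conv2d(feat_ch, 64, 3, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(64, 3, 3, padding=1),
        )

    def forward(self, z, points):
        # z: (B, z_dim) structural latent code.
        # points: (B, H, W, S, 3) sample locations along camera rays,
        #         assumed to lie in [-1, 1]^3 (S = samples per ray).
        B, H, W, S, _ = points.shape
        volume = self.volume_gen(z.view(B, -1, 1, 1, 1))       # (B, C, 8, 8, 8)
        # Trilinearly sample the feature volume at the ray points.
        grid = points.view(B, H, W * S, 1, 3)
        feats = F.grid_sample(volume, grid, align_corners=True)
        feats = feats.view(B, -1, H, W, S).permute(0, 2, 3, 4, 1)
        out = self.mlp(torch.cat([feats, points], dim=-1))
        point_feat, density = out[..., :-1], F.relu(out[..., -1:])
        # Accumulate point features into a 2D feature map (the textural
        # representation) via alpha compositing along each ray.
        alpha = 1.0 - torch.exp(-density)                      # (B, H, W, S, 1)
        trans = torch.cumprod(
            torch.cat([torch.ones_like(alpha[..., :1, :]),
                       1.0 - alpha + 1e-10], dim=-2), dim=-2)[..., :-1, :]
        weights = alpha * trans
        feat_map = (weights * point_feat).sum(dim=-2)          # (B, H, W, F)
        return self.renderer(feat_map.permute(0, 3, 1, 2))     # (B, 3, H, W)
```

In this sketch the disentanglement claimed in the abstract corresponds to the two inputs: resampling `z` changes the feature volume (shape), while swapping the weights or conditioning of the MLP and renderer would alter appearance, since only the accumulated feature map reaches the final RGB synthesis.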