Unsupervised generation of high-quality multi-view-consistent images and 3D shapes using only collections of single-view 2D photographs has been a long-standing challenge. Existing 3D GANs are either compute-intensive or make approximations that are not 3D-consistent; the former limits the quality and resolution of the generated images, and the latter adversely affects multi-view consistency and shape quality. In this work, we improve the computational efficiency and image quality of 3D GANs without overly relying on these approximations. We introduce an expressive hybrid explicit-implicit network architecture that, together with other design choices, synthesizes not only high-resolution multi-view-consistent images in real time but also produces high-quality 3D geometry. By decoupling feature generation and neural rendering, our framework is able to leverage state-of-the-art 2D CNN generators, such as StyleGAN2, and inherit their efficiency and expressiveness. We demonstrate state-of-the-art 3D-aware synthesis with FFHQ and AFHQ Cats, among other experiments.
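To make the hybrid explicit-implicit idea concrete, below is a minimal sketch (not the authors' released implementation) assuming a tri-plane-style representation: a 2D CNN generator outputs axis-aligned feature planes, query points are projected onto each plane, the sampled features are aggregated, and a lightweight MLP decodes them into density and color features for volume rendering. All class and function names here (`TriPlaneDecoder`, `sample_triplane`) are illustrative, and the backbone producing the planes is stubbed out with random tensors.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TriPlaneDecoder(nn.Module):
    """Lightweight MLP that decodes aggregated plane features into
    a density value and a color feature vector for volume rendering."""
    def __init__(self, feature_dim=32, hidden_dim=64, out_dim=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feature_dim, hidden_dim),
            nn.Softplus(),
            nn.Linear(hidden_dim, 1 + out_dim),  # 1 density + out_dim color features
        )

    def forward(self, feats):
        out = self.net(feats)
        sigma, color = out[..., :1], out[..., 1:]
        return sigma, color


def sample_triplane(planes, points):
    """Project 3D points onto three axis-aligned feature planes,
    bilinearly sample each plane, and sum the results.

    planes: (B, 3, C, H, W) feature planes from a 2D CNN generator
    points: (B, N, 3) query points in [-1, 1]^3
    returns: (B, N, C) aggregated per-point features
    """
    # 2D projections of each point onto the xy, xz, and yz planes
    coords = [points[..., [0, 1]], points[..., [0, 2]], points[..., [1, 2]]]
    feats = 0
    for i, uv in enumerate(coords):
        grid = uv.unsqueeze(1)                       # (B, 1, N, 2)
        sampled = F.grid_sample(planes[:, i], grid,  # (B, C, 1, N)
                                mode='bilinear', align_corners=False)
        feats = feats + sampled.squeeze(2).permute(0, 2, 1)  # (B, N, C)
    return feats


# Usage sketch: a StyleGAN2-like backbone (stubbed here) would produce `planes`;
# the decoded densities/colors are then volume-rendered along camera rays.
planes = torch.randn(2, 3, 32, 256, 256)   # stand-in for generator output
points = torch.rand(2, 4096, 3) * 2 - 1    # random query points in [-1, 1]^3
decoder = TriPlaneDecoder()
sigma, color = decoder(sample_triplane(planes, points))
```

Keeping the plane generator as a standard 2D CNN and the per-point decoder tiny is what allows the framework to inherit the efficiency of 2D generators while remaining 3D-consistent through volume rendering.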