State-of-the-art 3D-aware generative models rely on coordinate-based MLPs to parameterize 3D radiance fields. While demonstrating impressive results, querying an MLP for every sample along each ray leads to slow rendering. Therefore, existing approaches often render low-resolution feature maps and process them with an upsampling network to obtain the final image. Albeit efficient, this neural rendering often entangles viewpoint and content, such that changing the camera pose results in unwanted changes in geometry or appearance. Motivated by recent results in voxel-based novel view synthesis, in this paper we investigate the utility of sparse voxel grid representations for fast and 3D-consistent generative modeling. Our results demonstrate that monolithic MLPs can indeed be replaced by 3D convolutions when sparse voxel grids are combined with progressive growing, free-space pruning, and appropriate regularization. To obtain a compact representation of the scene and allow for scaling to higher voxel resolutions, our model disentangles the foreground object (modeled in 3D) from the background (modeled in 2D). In contrast to existing approaches, our method requires only a single forward pass to generate a full 3D scene. It hence allows for efficient rendering from arbitrary viewpoints while yielding 3D-consistent results with high visual fidelity.
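The free-space pruning mentioned in the abstract can be illustrated with a minimal NumPy sketch: voxels whose density falls below a threshold are discarded, leaving a sparse set of occupied coordinates and their features. Grid resolution, threshold value, and feature layout here are assumptions for illustration, not the paper's actual implementation.

```python
import numpy as np

# Hypothetical dense voxel grid: per-voxel density and RGB features.
rng = np.random.default_rng(0)
res = 32
density = rng.random((res, res, res))       # per-voxel density in [0, 1)
features = rng.random((res, res, res, 3))   # per-voxel RGB features

# Free-space pruning: keep only voxels above an (assumed) density threshold.
tau = 0.9
occupied = density > tau                    # boolean occupancy mask

coords = np.argwhere(occupied)              # (N, 3) integer coordinates of kept voxels
sparse_feats = features[occupied]           # (N, 3) features of kept voxels

print(f"{coords.shape[0]} of {res**3} voxels kept")
```

Storing only the occupied coordinates and their features is what makes scaling to higher voxel resolutions tractable: memory grows with the occupied surface rather than with the full volume.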