With the advent of Neural Radiance Fields (NeRF), neural networks can now render novel views of a 3D scene with quality that fools the human eye. Yet generating these images remains computationally intensive, limiting their applicability in practical scenarios. In this paper, we propose a technique based on spatial decomposition that mitigates this issue. Our key observation is that there are diminishing returns in employing larger (deeper and/or wider) networks. Hence, we propose to spatially decompose a scene and dedicate a smaller network to each decomposed part. Working together, these networks render the whole scene. This yields near-constant inference time regardless of the number of decomposed parts. Moreover, we show that a Voronoi spatial decomposition is preferable for this purpose, as it is provably compatible with the Painter's Algorithm, enabling efficient and GPU-friendly rendering. Our experiments show that for real-world scenes, our method provides up to 3x more efficient inference than NeRF (at the same rendering quality), or an improvement of up to 1.0~dB in PSNR (at the same inference cost).
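To make the decomposition concrete, the following is a minimal sketch of the routing idea: each 3D query point is assigned to the Voronoi cell of its nearest site and evaluated only by that cell's small network. All names here (`VoronoiDecomposedField`, `TinyMLP`, the site count, and the layer widths) are illustrative assumptions, not the paper's actual implementation; the sketch assumes PyTorch.

```python
# Illustrative sketch of Voronoi-based scene decomposition for a radiance
# field. Names and sizes are assumptions, not the paper's actual API.
import torch
import torch.nn as nn


class TinyMLP(nn.Module):
    """A small per-cell network; far cheaper than one large NeRF MLP."""

    def __init__(self, in_dim=3, hidden=64, out_dim=4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, out_dim),  # RGB + density
        )

    def forward(self, x):
        return self.net(x)


class VoronoiDecomposedField(nn.Module):
    """Routes each query point to the network owning its Voronoi cell."""

    def __init__(self, sites):
        super().__init__()
        self.register_buffer("sites", sites)  # (K, 3) Voronoi site positions
        self.heads = nn.ModuleList(TinyMLP() for _ in range(len(sites)))

    def forward(self, points):  # points: (N, 3)
        # The nearest site index defines each point's Voronoi cell.
        cell = torch.cdist(points, self.sites).argmin(dim=1)  # (N,)
        out = points.new_zeros(points.shape[0], 4)
        for k, head in enumerate(self.heads):
            mask = cell == k
            if mask.any():
                # Each point is evaluated by exactly one small head.
                out[mask] = head(points[mask])
        return out


# Example usage with 8 hypothetical Voronoi sites in the unit cube:
field = VoronoiDecomposedField(torch.rand(8, 3))
rgb_sigma = field(torch.rand(1024, 3))  # (1024, 4)
```

Because each point is evaluated by exactly one small head, the per-point cost stays roughly that of a single small network no matter how many cells the scene is split into, which is the source of the near-constant inference time claimed above.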