Neural radiance fields (NeRF) have demonstrated the potential of coordinate-based neural representations (neural fields, or implicit neural representations) in neural rendering. However, using a multi-layer perceptron (MLP) to represent a 3D scene or object requires enormous computational resources and time. Recent studies have addressed this inefficiency by introducing additional data structures, such as grids or trees. Despite their promising performance, these explicit data structures require a substantial amount of memory. In this work, we present a method to reduce their size without compromising the advantages of having additional data structures. Specifically, we propose applying the wavelet transform to grid-based neural fields: the grids provide fast convergence, while the wavelet transform, whose efficiency has been demonstrated in high-performance standard codecs, improves the parameter efficiency of the grids. Furthermore, to achieve higher sparsity of the grid coefficients while maintaining reconstruction quality, we present a novel trainable masking approach. Experimental results demonstrate that non-spatial grid coefficients, such as wavelet coefficients, can attain a higher level of sparsity than spatial grid coefficients, resulting in a more compact representation. With our proposed mask and compression pipeline, we achieve state-of-the-art performance within a memory budget of 2 MB. Our code is available at https://github.com/daniel03c1/masked_wavelet_nerf.
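The following is a minimal sketch, not the authors' implementation (which is available at the repository above), of the two ideas the abstract describes: storing a grid-based feature plane as wavelet coefficients that are converted back to a spatial grid by an inverse transform, and sparsifying those coefficients with a trainable mask trained via a straight-through estimator. The class name `MaskedWaveletPlane`, the single-level Haar transform, and the mean-of-mask sparsity penalty are illustrative assumptions rather than the paper's exact design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def inverse_haar_2d(ll, lh, hl, hh):
    """Single-level inverse 2D Haar transform (orthonormal normalization).

    Each subband has shape (C, H/2, W/2); the reconstructed plane has shape (C, H, W).
    """
    a = (ll + lh + hl + hh) / 2
    b = (ll - lh + hl - hh) / 2
    c = (ll + lh - hl - hh) / 2
    d = (ll - lh - hl + hh) / 2
    # Interleave the four quarter-resolution blocks back into a full-resolution plane.
    stacked = torch.stack([a, b, c, d], dim=1)     # (C, 4, H/2, W/2)
    return F.pixel_shuffle(stacked, 2).squeeze(1)  # (C, H, W)


class MaskedWaveletPlane(nn.Module):
    """A 2D feature plane stored as trainable, maskable Haar wavelet coefficients."""

    def __init__(self, channels, height, width):
        super().__init__()
        h2, w2 = height // 2, width // 2
        # Wavelet coefficients for the LL, LH, HL, HH subbands.
        self.coef = nn.Parameter(0.1 * torch.randn(4, channels, h2, w2))
        # Mask logits; sigmoid(logit) > 0.5 keeps the corresponding coefficient.
        self.mask_logit = nn.Parameter(torch.zeros(4, channels, h2, w2))

    def forward(self):
        soft = torch.sigmoid(self.mask_logit)
        hard = (soft > 0.5).float()
        # Straight-through estimator: binary mask in the forward pass,
        # gradients flow through the soft mask in the backward pass.
        mask = (hard - soft).detach() + soft
        ll, lh, hl, hh = (self.coef * mask).unbind(0)
        plane = inverse_haar_2d(ll, lh, hl, hh)    # spatial feature plane (C, H, W)
        # Penalizing the soft mask (added to the rendering loss with a small weight)
        # pushes more wavelet coefficients toward zero, i.e. higher sparsity.
        sparsity_loss = soft.mean()
        return plane, sparsity_loss


# Example usage: reconstruct the plane once per step and sample per-point features.
plane, sparsity = MaskedWaveletPlane(channels=16, height=128, width=128)()
coords = torch.rand(1, 1, 1024, 2) * 2 - 1  # 1024 query points in [-1, 1]^2
point_feats = F.grid_sample(plane.unsqueeze(0), coords, align_corners=True)  # (1, 16, 1, 1024)
```

Only the masked wavelet coefficients (and the binary mask) would need to be stored after training; coefficients whose mask is zero can be dropped, which is what makes the non-spatial representation more compact than storing the spatial grid directly.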