Neural Radiance Fields (NeRF) achieve photo-realistic view synthesis with densely captured input images. However, the geometry of NeRF is extremely under-constrained given sparse views, resulting in significant degradation of novel view synthesis quality. Inspired by self-supervised depth estimation methods, we propose StructNeRF, a solution to novel view synthesis for indoor scenes with sparse inputs. StructNeRF leverages the structural hints naturally embedded in multi-view inputs to handle the unconstrained geometry issue in NeRF. Specifically, it handles textured and non-textured regions separately: a patch-based multi-view consistent photometric loss is proposed to constrain the geometry of textured regions, while non-textured regions are explicitly constrained to be 3D-consistent planes. Through these dense self-supervised depth constraints, our method improves both the geometry and the view synthesis performance of NeRF without any additional training on external data. Extensive experiments on several real-world datasets demonstrate that StructNeRF surpasses state-of-the-art methods for indoor scenes with sparse inputs both quantitatively and qualitatively.
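The patch-based multi-view photometric consistency idea mentioned in the abstract can be sketched roughly as follows: pixels in a target-view patch are backprojected using rendered depth, reprojected into a source view, and compared photometrically. This is a minimal NumPy illustration under assumed pinhole intrinsics `K` and a known relative pose `(R, t)`; all function names and shapes here are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def backproject(depth, K, uv):
    """Lift target-view pixels uv (N,2) with depths (N,) to 3D camera points."""
    ones = np.ones((uv.shape[0], 1))
    pix = np.concatenate([uv, ones], axis=1)      # homogeneous pixel coords
    rays = (np.linalg.inv(K) @ pix.T).T           # camera-space ray directions
    return rays * depth[:, None]                  # scale rays by depth

def project(pts, K, R, t):
    """Transform target-camera points into the source camera and project."""
    cam = (R @ pts.T).T + t
    uv = (K @ cam.T).T
    return uv[:, :2] / uv[:, 2:3]

def bilinear_sample(img, uv):
    """Sample img (H,W,3) at continuous (x, y) coordinates uv (N,2)."""
    x, y = uv[:, 0], uv[:, 1]
    x0 = np.clip(np.floor(x).astype(int), 0, img.shape[1] - 2)
    y0 = np.clip(np.floor(y).astype(int), 0, img.shape[0] - 2)
    wx, wy = x - x0, y - y0
    top = img[y0, x0] * (1 - wx)[:, None] + img[y0, x0 + 1] * wx[:, None]
    bot = img[y0 + 1, x0] * (1 - wx)[:, None] + img[y0 + 1, x0 + 1] * wx[:, None]
    return top * (1 - wy)[:, None] + bot * wy[:, None]

def patch_photometric_loss(tgt_patch, src_img, depth, uv, K, R, t):
    """Warp the target patch into the source view and measure L1 color error.

    A depth estimate that is multi-view consistent yields a small loss, so
    minimizing this term constrains geometry in textured regions.
    """
    pts = backproject(depth, K, uv)
    src_uv = project(pts, K, R, t)
    warped = bilinear_sample(src_img, src_uv)
    return np.abs(warped - tgt_patch).mean()
```

With a correct depth and pose, the warped source colors match the target patch and the loss approaches zero; wrong depth misaligns the reprojection and the photometric error grows, which is the supervisory signal such losses exploit.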