Casually captured Neural Radiance Fields (NeRFs) suffer from artifacts such as floaters or flawed geometry when rendered outside the camera trajectory. Existing evaluation protocols often do not capture these effects, since they usually only assess image quality at every 8th frame of the training capture. To push forward progress in novel-view synthesis, we propose a new dataset and evaluation procedure, where two camera trajectories of the scene are recorded: one used for training, and the other for evaluation. In this more challenging in-the-wild setting, we find that existing hand-crafted regularizers neither remove floaters nor improve scene geometry. Thus, we propose a 3D diffusion-based method that leverages local 3D priors and a novel density-based score distillation sampling loss to discourage artifacts during NeRF optimization. We show that this data-driven prior removes floaters and improves scene geometry for casual captures.
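To make the density-based score distillation sampling (SDS) idea concrete, below is a minimal PyTorch sketch of an SDS-style update applied to a local crop of NeRF densities under a frozen 3D diffusion prior. The abstract does not specify the exact formulation, so everything here is an assumption: `DensityDiffusionModel`, the grid shape, and the noise schedule are hypothetical placeholders, not the paper's actual architecture or loss.

```python
# Hedged sketch of a density-based SDS loss, assuming a pretrained 3D
# diffusion model over local density grids. All names and shapes below
# are hypothetical; the paper's actual method may differ.
import torch

class DensityDiffusionModel(torch.nn.Module):
    """Hypothetical stand-in for a frozen 3D diffusion prior that
    predicts the noise added to a local density cube."""
    def __init__(self):
        super().__init__()
        self.net = torch.nn.Conv3d(1, 1, kernel_size=3, padding=1)

    def forward(self, noisy_grid, t):
        # Placeholder epsilon prediction; a real prior would condition on t.
        return self.net(noisy_grid)

def density_sds_loss(density_grid, diffusion, alphas_cumprod):
    """SDS-style loss on a local NeRF density crop.

    density_grid: (B, 1, D, H, W) densities queried from the NeRF,
                  differentiable w.r.t. the NeRF parameters.
    """
    B = density_grid.shape[0]
    t = torch.randint(0, len(alphas_cumprod), (B,))           # random timestep
    a = alphas_cumprod[t].view(B, 1, 1, 1, 1)
    noise = torch.randn_like(density_grid)
    noisy = a.sqrt() * density_grid + (1 - a).sqrt() * noise  # forward diffuse
    with torch.no_grad():
        eps_pred = diffusion(noisy, t)                        # frozen prior
    grad = eps_pred - noise                                   # SDS gradient
    # Surrogate loss whose gradient w.r.t. density_grid equals `grad`.
    return (grad.detach() * density_grid).mean()

# Usage: inside the NeRF training loop, query a random local density cube
# and add this term to the photometric objective.
diffusion = DensityDiffusionModel().eval()
alphas_cumprod = torch.linspace(0.9999, 0.01, 1000)
density_crop = torch.rand(2, 1, 16, 16, 16, requires_grad=True)
loss = density_sds_loss(density_crop, diffusion, alphas_cumprod)
loss.backward()
```

The key design choice this sketch illustrates is that the diffusion prior is kept frozen and its noise-prediction error is injected as a gradient on the density field itself, rather than on rendered images, so the prior can penalize floaters and flawed geometry directly in 3D.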