Despite the rapid development of Neural Radiance Fields (NeRF), the need for dense view coverage largely prohibits their wider application. While several recent works have attempted to address this issue, they either still require several sparse views or operate only on simple objects and scenes. In this work, we consider a more ambitious task: training a neural radiance field over realistically complex visual scenes by "looking only once", i.e., using only a single view. To attain this goal, we present a Single View NeRF (SinNeRF) framework consisting of thoughtfully designed semantic and geometry regularizations. Specifically, SinNeRF constructs a semi-supervised learning process in which we introduce and propagate geometry pseudo labels and semantic pseudo labels to guide the progressive training process. Extensive experiments are conducted on complex scene benchmarks, including the NeRF synthetic dataset, the Local Light Field Fusion dataset, and the DTU dataset. We show that even without pre-training on multi-view datasets, SinNeRF can yield photo-realistic novel-view synthesis results. In the single-image setting, SinNeRF significantly outperforms the current state-of-the-art NeRF baselines in all cases. Project page: https://vita-group.github.io/SinNeRF/