Neural Radiance Fields (NeRF) have been proposed for photorealistic novel view rendering. However, NeRF requires many different views of the same scene for training. Moreover, it generalizes poorly to new scenes and requires retraining or fine-tuning on each scene. In this paper, we develop a new NeRF model for novel view synthesis that uses only a single image as input. We propose to combine (coarse) planar rendering with (fine) volume rendering to achieve higher rendering quality and better generalization. We also design a depth teacher network that predicts dense pseudo depth maps to supervise the joint rendering mechanism and boost the learning of consistent 3D geometry. We evaluate our method on three challenging datasets. It outperforms state-of-the-art single-view NeRFs, achieving 5$\sim$20\% improvements in PSNR and reducing depth-rendering errors by 20$\sim$50\%. It also shows excellent generalization to unseen data without the need to fine-tune on each new scene.
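To make the joint rendering and depth-supervision idea concrete, below is a minimal sketch, assuming standard NeRF-style alpha compositing for both the coarse planar pass and the fine volume pass. The function names (`composite`, `joint_render_loss`), the depth-loss weight `lambda_d`, and the choice of MSE/L1 losses are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn.functional as F

def composite(rgb, sigma, z_vals):
    """Alpha compositing shared by both rendering passes.
    rgb:    (R, S, 3) sampled colors
    sigma:  (R, S)    sampled densities
    z_vals: (R, S)    sample depths along each ray
    Returns per-ray color (R, 3) and expected depth (R,).
    """
    # Distances between adjacent samples; last interval is effectively infinite.
    deltas = torch.cat([z_vals[:, 1:] - z_vals[:, :-1],
                        1e10 * torch.ones_like(z_vals[:, :1])], dim=-1)
    alpha = 1.0 - torch.exp(-sigma * deltas)            # per-sample opacity
    trans = torch.cumprod(torch.cat([torch.ones_like(alpha[:, :1]),
                                     1.0 - alpha + 1e-10], dim=-1),
                          dim=-1)[:, :-1]               # accumulated transmittance
    w = alpha * trans                                   # compositing weights
    color = (w.unsqueeze(-1) * rgb).sum(dim=1)          # (R, 3)
    depth = (w * z_vals).sum(dim=1)                     # expected ray depth (R,)
    return color, depth

def joint_render_loss(coarse_out, fine_out, gt_rgb, pseudo_depth):
    """Supervise both passes: colors against the target view, rendered
    depths against the teacher's dense pseudo depth map.
    lambda_d is an assumed weighting, not a value from the paper."""
    lambda_d = 0.1
    loss = 0.0
    for rgb, depth in (coarse_out, fine_out):
        loss = loss + F.mse_loss(rgb, gt_rgb) \
                    + lambda_d * F.l1_loss(depth, pseudo_depth)
    return loss

# Toy usage with random tensors standing in for network predictions.
R, S = 1024, 64                                         # rays, samples per ray
z = torch.linspace(2.0, 6.0, S).expand(R, S)
coarse = composite(torch.rand(R, S, 3), torch.rand(R, S), z)  # planar pass
fine = composite(torch.rand(R, S, 3), torch.rand(R, S), z)    # volume pass
loss = joint_render_loss(coarse, fine, torch.rand(R, 3), torch.rand(R))
print(loss.item())
```

In practice the coarse pass would composite over a small fixed set of fronto-parallel planes (MPI-style) while the fine pass samples densities per ray, but both reduce to the same compositing step shown here, which is what lets a single pseudo-depth map from the teacher supervise them jointly.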