pixelNeRF: Neural Radiance Fields from One or Few Images

We propose pixelNeRF, a learning framework that predicts a continuous neural scene representation conditioned on one or few input images. The existing approach for constructing neural radiance fields involves optimizing the representation for each scene independently, requiring many calibrated views and significant compute time. We take a step towards resolving these shortcomings by introducing an architecture that conditions a NeRF on image inputs in a fully convolutional manner. This allows the network to be trained across multiple scenes to learn a scene prior, enabling it to perform novel view synthesis in a feed-forward manner from a sparse set of views (as few as one). Leveraging the volume rendering approach of NeRF, our model can be trained directly from images with no explicit 3D supervision. We conduct extensive experiments on ShapeNet benchmarks for single-image novel view synthesis tasks with held-out objects as well as entire unseen categories. We further demonstrate the flexibility of pixelNeRF by applying it to multi-object ShapeNet scenes and real scenes from the DTU dataset. In all cases, pixelNeRF outperforms current state-of-the-art baselines for novel view synthesis and single-image 3D reconstruction. For the video and code, please visit the project website: https://alexyu.net/pixelnerf
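
The abstract relies on NeRF-style volume rendering to train from images alone. As a rough illustration only, the sketch below composites density and color samples along a single ray using the standard quadrature: each sample's weight is its opacity times the transmittance accumulated before it. The function name `composite_ray` and its arguments are hypothetical and are not taken from the pixelNeRF code; in the actual model the densities and colors would come from the image-conditioned network rather than being passed in by hand.

```python
import math


def composite_ray(sigmas, colors, deltas):
    """Alpha-composite samples along one ray (NeRF-style quadrature).

    sigmas: non-negative density at each sample, ordered near to far
    colors: (r, g, b) tuple at each sample
    deltas: distance covered by each sample's segment
    Returns (rgb, weights). Hypothetical helper, for illustration only.
    """
    transmittance = 1.0  # fraction of light still unblocked before this sample
    rgb = [0.0, 0.0, 0.0]
    weights = []
    for sigma, color, delta in zip(sigmas, colors, deltas):
        alpha = 1.0 - math.exp(-sigma * delta)  # opacity of this segment
        weight = transmittance * alpha          # contribution of this sample
        weights.append(weight)
        for channel in range(3):
            rgb[channel] += weight * color[channel]
        transmittance *= 1.0 - alpha            # light remaining for later samples
    return rgb, weights


# Empty space, then a nearly opaque red surface, then a green sample behind it.
rgb, weights = composite_ray(
    sigmas=[0.0, 1000.0, 5.0],
    colors=[(0.0, 0.0, 1.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
    deltas=[0.1, 0.1, 0.1],
)
print(rgb, weights)
```

Because every step is differentiable in the densities and colors, a photometric loss between the composited color and the ground-truth pixel can, in principle, supervise the network without any explicit 3D labels, which is the property the abstract refers to.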
