Neural radiance fields (NeRF) encode a scene into a neural representation that enables photo-realistic rendering of novel views. However, a successful reconstruction from RGB images requires a large number of input views taken under static conditions: typically up to a few hundred images for room-size scenes. Our method aims to synthesize novel views of whole rooms from an order of magnitude fewer images. To this end, we leverage dense depth priors to constrain the NeRF optimization. First, we take advantage of the sparse depth data that is freely available from the structure-from-motion (SfM) preprocessing step used to estimate camera poses. Second, we use depth completion to convert these sparse points into dense depth maps and uncertainty estimates, which are used to guide NeRF optimization. Our method enables data-efficient novel view synthesis on challenging indoor scenes, using as few as 18 images for an entire scene.
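The second step above pairs each completed depth value with an uncertainty estimate, and the abstract states these guide the NeRF optimization. A natural way to use such uncertainties is to weight a depth supervision term by them, e.g. via a Gaussian negative log-likelihood, so that rays with unreliable depth priors contribute less. The sketch below illustrates this idea only; the function name, the `lam` weight, and the exact loss form are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def depth_guided_nerf_loss(rgb_pred, rgb_gt, depth_pred, depth_gt, depth_std, lam=0.1):
    """Color loss plus an uncertainty-weighted depth term (illustrative sketch).

    Rays whose completed depth carries a large uncertainty `depth_std`
    are down-weighted, so unreliable depth priors do not dominate training.
    All arguments are per-ray NumPy arrays.
    """
    # Standard NeRF photometric loss: mean squared error on rendered colors.
    color_loss = np.mean((rgb_pred - rgb_gt) ** 2)
    # Gaussian negative log-likelihood per ray:
    #   (d_hat - d)^2 / (2 * sigma^2) + log(sigma)
    nll = (depth_pred - depth_gt) ** 2 / (2.0 * depth_std ** 2) + np.log(depth_std)
    depth_loss = np.mean(nll)
    return color_loss + lam * depth_loss
```

With a fixed depth error, increasing `depth_std` shrinks the depth term's gradient, which is the behavior one wants from an uncertainty-aware prior: confident depth completions constrain geometry strongly, uncertain ones only weakly.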