A commonly observed failure mode of Neural Radiance Fields (NeRF) is fitting incorrect geometry when given an insufficient number of input views. One potential reason is that standard volumetric rendering does not enforce the constraint that most of a scene's geometry consists of empty space and opaque surfaces. We formalize this assumption through DS-NeRF (Depth-supervised Neural Radiance Fields), a loss for learning radiance fields that takes advantage of readily available depth supervision. We leverage the fact that current NeRF pipelines require images with known camera poses, which are typically estimated by running structure-from-motion (SFM). Crucially, SFM also produces sparse 3D points that can be used as "free" depth supervision during training: we add a loss to encourage the distribution of a ray's terminating depth to match a given 3D keypoint, incorporating depth uncertainty. DS-NeRF can render better images given fewer training views while training 2-3x faster. Further, we show that our loss is compatible with other recently proposed NeRF methods, demonstrating that depth is a cheap and easily digestible supervisory signal. Finally, we find that DS-NeRF can support other types of depth supervision, such as scanned depth sensors and RGB-D reconstruction outputs.
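The depth-supervision idea above can be illustrated with a minimal sketch. In volume rendering, each ray produces per-sample termination weights; the sketch below penalizes weight mass placed far from an SFM keypoint's depth, modeled as a Gaussian with standard deviation `sigma` to capture depth uncertainty. This is a simplified illustration, not the paper's exact KL-based formulation; all names (`depth_loss`, `weights`, `t_vals`) are illustrative.

```python
import numpy as np

def depth_loss(weights, t_vals, depth, sigma):
    """Simplified depth-supervision loss for one ray.

    weights: per-sample ray-termination weights from volume rendering
             (nonnegative, summing to <= 1)
    t_vals:  depth of each sample along the ray
    depth:   SFM keypoint depth for this ray
    sigma:   depth uncertainty (std of the keypoint's Gaussian)

    Returns the weighted negative log-likelihood of the sample depths
    under N(depth, sigma^2): low when termination mass concentrates
    near the keypoint depth, high when it lands elsewhere.
    """
    log_p = -0.5 * ((t_vals - depth) / sigma) ** 2
    return float(-np.sum(weights * log_p))

# A ray whose termination weights peak at the keypoint depth (0.5)
# incurs a smaller loss than one terminating far from it.
t = np.linspace(0.0, 1.0, 5)
w_near = np.array([0.0, 0.0, 1.0, 0.0, 0.0])  # terminates at t = 0.5
w_far = np.array([1.0, 0.0, 0.0, 0.0, 0.0])   # terminates at t = 0.0
loss_near = depth_loss(w_near, t, depth=0.5, sigma=0.1)
loss_far = depth_loss(w_far, t, depth=0.5, sigma=0.1)
```

In a training loop, this per-ray term would be averaged over rays with SFM keypoints and added to the usual photometric loss; rays without keypoints contribute only the photometric term.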