We present PVSeRF, a learning framework that reconstructs neural radiance fields from single-view RGB images for novel view synthesis. Previous solutions, such as pixelNeRF, rely only on pixel-aligned features and suffer from feature ambiguity. As a result, they struggle to disentangle geometry from appearance, leading to implausible geometries and blurry results. To address this challenge, we propose to incorporate explicit geometry reasoning and combine it with pixel-aligned features for radiance field prediction. Specifically, in addition to pixel-aligned features, we further condition radiance field learning on i) voxel-aligned features learned from a coarse volumetric grid and ii) fine surface-aligned features extracted from a regressed point cloud. We show that introducing such geometry-aware features helps achieve a better disentanglement between appearance and geometry, i.e., recovering more accurate geometries and synthesizing higher-quality novel views. Extensive experiments against state-of-the-art methods on ShapeNet benchmarks demonstrate the superiority of our approach for single-image novel view synthesis.
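To make the conditioning scheme concrete, the following is a minimal sketch (not the authors' released code) of a radiance-field head that takes a query point's positional encoding together with pixel-aligned, voxel-aligned, and surface-aligned feature vectors and predicts color and density. All names and dimensions (PVSeRFHead, pixel_feat, voxel_feat, surf_feat, the feature sizes) are illustrative assumptions.

```python
import torch
import torch.nn as nn

class PVSeRFHead(nn.Module):
    """Hypothetical head: predicts (RGB, sigma) for a 3D query point from its
    positional encoding plus the three aligned feature vectors."""
    def __init__(self, pos_dim=63, pixel_dim=256, voxel_dim=32, surf_dim=32, hidden=256):
        super().__init__()
        in_dim = pos_dim + pixel_dim + voxel_dim + surf_dim
        self.mlp = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(inplace=True),
            nn.Linear(hidden, hidden), nn.ReLU(inplace=True),
            nn.Linear(hidden, 4),  # 3 channels for RGB, 1 for density
        )

    def forward(self, pos_enc, pixel_feat, voxel_feat, surf_feat):
        # Concatenate geometry-aware features with the pixel-aligned feature
        # before the radiance-field MLP.
        x = torch.cat([pos_enc, pixel_feat, voxel_feat, surf_feat], dim=-1)
        out = self.mlp(x)
        rgb = torch.sigmoid(out[..., :3])   # color in [0, 1]
        sigma = torch.relu(out[..., 3:])    # non-negative density
        return rgb, sigma

# Usage with dummy inputs for a batch of 1024 sampled points.
head = PVSeRFHead()
n = 1024
rgb, sigma = head(
    torch.randn(n, 63),    # positional encoding of query points
    torch.randn(n, 256),   # pixel-aligned features (projected image features)
    torch.randn(n, 32),    # voxel-aligned features (sampled from a coarse grid)
    torch.randn(n, 32),    # surface-aligned features (from a regressed point cloud)
)
```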