Neural Radiance Field (NeRF) has broken new ground in novel view synthesis due to its simple concept and state-of-the-art quality. However, it suffers from severe performance degradation unless it is trained with a dense set of images captured from different camera poses, which hinders its practical applications. Although previous methods addressing this problem have achieved promising results, they rely heavily on additional training resources, which goes against the philosophy of sparse-input novel view synthesis and its pursuit of training efficiency. In this work, we propose MixNeRF, an effective training strategy for novel view synthesis from sparse inputs that models a ray with a mixture density model. Our MixNeRF estimates the joint distribution of RGB colors along the samples of a ray by modeling it as a mixture of distributions. We also propose ray depth estimation as a new auxiliary training objective, which is highly correlated with 3D scene geometry. Moreover, we remodel the colors with blending weights regenerated from the estimated ray depth, which further improves robustness to shifts in colors and viewpoints. Our MixNeRF outperforms other state-of-the-art methods on various standard benchmarks with superior training and inference efficiency.
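To make the ray-wise mixture density idea above concrete, the sketch below scores a ground-truth pixel color under a mixture whose components are centered at the per-sample RGB predictions along the ray, with the volume-rendering blending weights reused as mixing coefficients. This is a minimal illustration under stated assumptions, not the authors' exact formulation: the function name mixture_ray_nll, the tensor shapes, and the choice of Laplace components are hypothetical.

```python
import torch

def mixture_ray_nll(rgb_samples, betas, weights, gt_rgb, eps=1e-8):
    """Negative log-likelihood of GT colors under a per-ray mixture (illustrative sketch).

    rgb_samples: (R, S, 3) per-sample RGB means predicted along each ray
    betas:       (R, S, 1) per-sample scale parameters of the components (assumption)
    weights:     (R, S, 1) volume-rendering blending weights, reused as mixing coefficients
    gt_rgb:      (R, 3)    ground-truth pixel colors
    """
    # Normalize the blending weights so they form valid mixing coefficients.
    pi = weights / (weights.sum(dim=1, keepdim=True) + eps)            # (R, S, 1)

    # Laplace log-density of the GT color under each component (assumed component family).
    diff = torch.abs(gt_rgb[:, None, :] - rgb_samples)                 # (R, S, 3)
    log_comp = -(diff / (betas + eps)) - torch.log(2.0 * betas + eps)  # (R, S, 3)
    log_comp = log_comp.sum(dim=-1, keepdim=True)                      # (R, S, 1)

    # Log-sum-exp over the samples gives the mixture log-likelihood per ray.
    log_mix = torch.logsumexp(torch.log(pi + eps) + log_comp, dim=1)   # (R, 1)
    return -log_mix.mean()
```

A depth objective of the same form could reuse pi with per-sample depths in place of rgb_samples, which is one plausible way to tie the auxiliary ray depth estimation to the same mixture machinery.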