In this paper, we propose MINE to perform novel view synthesis and depth estimation via dense 3D reconstruction from a single image. Our approach is a continuous-depth generalization of Multiplane Images (MPI), obtained by introducing Neural Radiance Fields (NeRF). Given a single image as input, MINE predicts a 4-channel image (RGB and volume density) at arbitrary depth values to jointly reconstruct the camera frustum and fill in occluded content. The reconstructed and inpainted frustum can then be easily rendered into novel RGB or depth views using differentiable rendering. Extensive experiments on RealEstate10K, KITTI, and Flowers Light Fields show that MINE outperforms the state of the art by a large margin in novel view synthesis. We also achieve competitive results in depth estimation on iBims-1 and NYU-v2 without annotated depth supervision. Our source code is available at https://github.com/vincentfung13/MINE
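To make the core rendering step concrete, below is a minimal sketch of how predicted (RGB, density) planes can be composited front to back with NeRF-style volume rendering weights, assuming PyTorch; the function name, tensor layout, and plane-spacing convention are illustrative assumptions, not the authors' implementation.

```python
import torch

def composite_planes(rgb, sigma, depths):
    """Front-to-back compositing of N fronto-parallel planes
    (a NeRF-style generalization of MPI over-compositing).

    rgb:    (N, 3, H, W) predicted colour at each sampled depth plane
    sigma:  (N, 1, H, W) predicted volume density at each plane
    depths: (N,) plane depths, sorted from near to far
    Returns a rendered RGB image and an expected-depth map.
    """
    # Spacing between consecutive planes; the last plane reuses the
    # preceding spacing (an assumed convention for this sketch).
    deltas = depths[1:] - depths[:-1]
    deltas = torch.cat([deltas, deltas[-1:]]).view(-1, 1, 1, 1)

    # Per-plane opacity from density: alpha_i = 1 - exp(-sigma_i * delta_i)
    alpha = 1.0 - torch.exp(-sigma * deltas)               # (N, 1, H, W)

    # Accumulated transmittance T_i = prod_{j<i} (1 - alpha_j)
    trans = torch.cumprod(
        torch.cat([torch.ones_like(alpha[:1]), 1.0 - alpha[:-1]]), dim=0)
    weights = trans * alpha                                # (N, 1, H, W)

    image = (weights * rgb).sum(dim=0)                     # (3, H, W)
    depth = (weights * depths.view(-1, 1, 1, 1)).sum(dim=0)  # (1, H, W)
    return image, depth
```

Because the compositing is fully differentiable, photometric losses on rendered novel views can be backpropagated to the network that predicts the per-plane RGB and density, which is what enables training without depth annotations.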