In this study, we present a method for synthesizing novel views from a single 360-degree RGB-D image based on the neural radiance field (NeRF). Prior studies relied on the neighborhood interpolation capability of multi-layer perceptrons to complete missing regions caused by occlusion and zooming, which leads to artifacts. In the proposed method, the input image is reprojected to 360-degree RGB images at other camera positions, the missing regions of the reprojected images are completed by a 2D image generative model, and the completed images are used to train the NeRF. Because the multiple completed images contain 3D inconsistencies, we introduce a method to train the NeRF using a subset of completed images that covers the target scene with little overlap among the completed regions. The selection of such a subset can be formulated as a maximum weight independent set (MWIS) problem, which we solve through simulated annealing. Experiments demonstrated that the proposed method can synthesize plausible novel views while preserving the features of the scene for both artificial and real-world data.
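To make the subset-selection step concrete, the following is a minimal illustrative sketch (not the authors' implementation) of approximating the MWIS problem with simulated annealing. It assumes each candidate completed image carries a weight (e.g., its scene coverage) and that edges connect pairs of images whose completed regions overlap; the function name and parameters are hypothetical.

```python
import math
import random

def mwis_simulated_annealing(weights, edges, steps=10000, t0=1.0, cooling=0.999):
    """Approximate MWIS: maximize the total weight of selected nodes
    subject to no two selected nodes sharing an edge."""
    n = len(weights)
    adj = [set() for _ in range(n)]
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    selected = [False] * n

    def gain(i):
        # Weight change from flipping node i. Turning i on also forces
        # conflicting (selected) neighbors off, so subtract their weights.
        if selected[i]:
            return -weights[i]
        return weights[i] - sum(weights[j] for j in adj[i] if selected[j])

    t = t0
    for _ in range(steps):
        i = random.randrange(n)
        d = gain(i)
        # Accept improving moves always; worsening moves with probability
        # exp(d / t), which shrinks as the temperature t cools.
        if d >= 0 or random.random() < math.exp(d / t):
            if not selected[i]:
                for j in adj[i]:  # preserve independence: drop neighbors
                    selected[j] = False
            selected[i] = not selected[i]
        t *= cooling
    return [i for i in range(n) if selected[i]]

# Example: four candidate images; images 0-1 and 1-2 overlap too much.
# picked = mwis_simulated_annealing([3.0, 2.0, 1.5, 1.0], [(0, 1), (1, 2)])
```

The repair step inside the acceptance move (dropping selected neighbors before turning a node on) keeps every visited state a valid independent set, which is one common way to apply annealing to MWIS.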