Though Neural Radiance Field (NeRF) demonstrates compelling novel view synthesis results, it is still unintuitive to edit a pre-trained NeRF because the neural network's parameters and the scene geometry/appearance are often not explicitly associated. In this paper, we introduce the first framework that enables users to remove unwanted objects or retouch undesired regions in a 3D scene represented by a pre-trained NeRF, without any category-specific data or training. The user first draws a free-form mask on a rendered view from the pre-trained NeRF to specify a region containing the unwanted objects. Our framework then transfers the user-provided mask to other rendered views and estimates guiding color and depth images within these transferred masked regions. Next, we formulate an optimization problem that jointly inpaints the image content in all masked regions across multiple views by updating the NeRF model's parameters. We demonstrate our framework on diverse scenes and show that it obtains visually plausible and structurally consistent results across multiple views, in less time and with less manual user effort.
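To make the joint optimization step concrete, below is a minimal sketch of how the described fine-tuning could look in PyTorch. It is an assumption-laden illustration, not the authors' implementation: the `ToyNeRF` module, the flat ray-to-color/depth rendering, the loss weights, and the structure of the per-view data (`rays`, `mask`, `guide_rgb`, `guide_depth`) are all hypothetical stand-ins. The key idea it mirrors is from the abstract: keep the rendering faithful outside the masks while pulling the masked regions toward the estimated guiding color and depth images, optimizing over all views at once by updating the NeRF parameters.

```python
# Hypothetical sketch of the joint multi-view inpainting optimization.
# All module and field names here are illustrative assumptions.
import torch
import torch.nn as nn


class ToyNeRF(nn.Module):
    """Stand-in for a pre-trained NeRF: maps a ray (origin, direction)
    directly to an RGB color and a depth value. A real NeRF would
    volume-render an MLP over positionally encoded 3D samples per ray."""

    def __init__(self, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(6, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),  # 3 color channels + 1 depth
        )

    def forward(self, rays):                # rays: (N, 6)
        out = self.net(rays)
        rgb = torch.sigmoid(out[:, :3])     # colors in [0, 1]
        depth = torch.relu(out[:, 3:])      # non-negative depth
        return rgb, depth


def inpaint_nerf(model, views, num_iters=1000, lambda_depth=0.1):
    """Fine-tune a (toy) pre-trained NeRF so masked regions match the
    guiding images. `views` is a list of dicts, each with:
      rays        (N, 6)  ray origins and directions,
      rgb         (N, 3)  original rendered colors,
      mask        (N,)    bool, True inside the user-specified region,
      guide_rgb   (N, 3)  estimated guiding colors (e.g. 2D inpainter),
      guide_depth (N, 1)  estimated guiding depths."""
    opt = torch.optim.Adam(model.parameters(), lr=1e-4)
    for _ in range(num_iters):
        opt.zero_grad()
        loss = torch.zeros((), requires_grad=False)
        for v in views:  # jointly optimize across all masked views
            rgb, depth = model(v["rays"])
            m = v["mask"]
            # Outside the mask: stay faithful to the original renderings.
            loss = loss + ((rgb[~m] - v["rgb"][~m]) ** 2).mean()
            # Inside the mask: follow the guiding color and depth images.
            loss = loss + ((rgb[m] - v["guide_rgb"][m]) ** 2).mean()
            loss = loss + lambda_depth * (
                (depth[m] - v["guide_depth"][m]) ** 2
            ).mean()
        loss.backward()
        opt.step()
    return model
```

In this reading, the balance between the color and depth terms (here a single hypothetical weight `lambda_depth`) is what lets the optimization produce geometry-consistent fills rather than view-by-view 2D inpaintings.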