This paper presents one of the first learning-based pipelines for 3D instance segmentation in NeRF, dubbed Instance Neural Radiance Field, or Instance NeRF. Taking as input a NeRF pretrained on multi-view RGB images, Instance NeRF learns the 3D instance segmentation of a given scene, represented as an instance field component of the NeRF model. To this end, we apply a 3D proposal-based mask prediction network to volumetric features sampled from the NeRF, which generates discrete 3D instance masks. The coarse 3D mask prediction is then projected to image space and matched against 2D segmentation masks that existing panoptic segmentation models generate for different views; the matched 2D masks are used to supervise the training of the instance field. Notably, beyond generating consistent 2D segmentation maps from novel views, Instance NeRF can query instance information at any 3D point, which greatly enhances NeRF object segmentation and manipulation. Our method is also one of the first to achieve such results without ground-truth instance information during inference. In experiments on synthetic and real-world NeRF datasets with complex indoor scenes, Instance NeRF surpasses previous NeRF segmentation works and competitive 2D segmentation methods in segmentation performance on unseen views. See the demo video at https://youtu.be/wW9Bme73coI.
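The step in which projected 3D masks are matched to per-view 2D masks can be illustrated with a small sketch. The abstract does not specify the matching criterion, so the version below assumes a greedy one-to-one assignment by intersection over union (IoU); the names `iou`, `match_masks`, `block`, and the `min_iou` threshold are hypothetical and not taken from the paper. Masks are plain sets of pixel coordinates so that the example needs only the standard library.

```python
from typing import Dict, Set, Tuple

Pixel = Tuple[int, int]
Mask = Set[Pixel]


def iou(a: Mask, b: Mask) -> float:
    """Intersection over union of two pixel sets (0.0 if both are empty)."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def match_masks(projected: Dict[int, Mask],
                predicted: Dict[int, Mask],
                min_iou: float = 0.5) -> Dict[int, int]:
    """Map each 2D mask id to the projected 3D instance id it overlaps most.

    Assumption: greedy matching in order of decreasing IoU, with each id
    used at most once and pairs below `min_iou` left unmatched.
    """
    pairs = sorted(
        ((iou(m3d, m2d), id3d, id2d)
         for id3d, m3d in projected.items()
         for id2d, m2d in predicted.items()),
        key=lambda t: t[0],
        reverse=True,
    )
    mapping: Dict[int, int] = {}
    used_3d = set()
    for score, id3d, id2d in pairs:
        if score < min_iou:
            break  # remaining pairs overlap even less
        if id3d in used_3d or id2d in mapping:
            continue  # one of the two masks is already matched
        mapping[id2d] = id3d
        used_3d.add(id3d)
    return mapping


def block(r0: int, r1: int, c0: int, c1: int) -> Mask:
    """Rectangular toy mask covering rows [r0, r1) and columns [c0, c1)."""
    return {(r, c) for r in range(r0, r1) for c in range(c0, c1)}


# Two 3D instances projected into one view, and two 2D masks whose ids
# come from a 2D segmenter and are therefore arbitrary in this view.
projected = {0: block(0, 4, 0, 4), 1: block(4, 8, 4, 8)}
predicted = {7: block(4, 8, 5, 8), 3: block(0, 4, 0, 3)}
mapping = match_masks(projected, predicted)
print(mapping)
```

Under this assumption, relabeling each view's 2D masks through `mapping` gives instance ids that agree across views, which is the property the supervision of the instance field relies on.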