Removing clutter from scenes is essential in many applications, ranging from privacy-sensitive content filtering to data augmentation. In this work, we present an automatic system that removes clutter from 3D scenes and inpaints the result with coherent geometry and texture. We propose techniques for its two key components: 3D segmentation from shared properties and 3D inpainting, both of which are important problems. The definition of 3D scene clutter (frequently moving objects) is not well captured by the object categories commonly studied in computer vision. To tackle the lack of well-defined clutter annotations, we group noisy fine-grained labels, leverage virtual rendering, and impose an instance-level area-sensitive loss. Once clutter is removed, we inpaint the geometry and texture of the resulting holes by merging inpainted RGB-D images. This requires novel voting and pruning strategies that guarantee multi-view consistency across individually inpainted images for mesh reconstruction. Experiments on the ScanNet and Matterport datasets show that our method outperforms baselines for clutter segmentation and 3D inpainting, both visually and quantitatively.
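The abstract mentions an instance-level area-sensitive loss for handling clutter annotations. The paper's exact formulation is not given here; the following is a minimal NumPy sketch of one plausible form (the function name and signature are hypothetical): per-point losses are first averaged within each instance and then averaged over instances, so small clutter objects contribute as much as large ones instead of being dominated by area.

```python
import numpy as np

def area_sensitive_loss(per_point_loss, instance_ids):
    """Hypothetical instance-level area-sensitive loss.

    Averages the per-point loss within each instance, then averages
    across instances, equalizing the contribution of small and large
    objects regardless of their area (point count).
    """
    instance_means = [
        per_point_loss[instance_ids == inst].mean()
        for inst in np.unique(instance_ids)
    ]
    return float(np.mean(instance_means))

# A large instance (3 points, loss 1.0) and a small one (1 point, loss 4.0):
# a naive mean over points would give 1.75, while the area-sensitive
# version weights both instances equally, giving (1.0 + 4.0) / 2 = 2.5.
loss = area_sensitive_loss(
    np.array([1.0, 1.0, 1.0, 4.0]),
    np.array([0, 0, 0, 1]),
)
```

This illustrates only the area-balancing idea; the actual loss in the paper may combine it with other terms.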