In order to function in unstructured environments, robots need the ability to recognize unseen objects. We take a step in this direction by tackling the problem of segmenting unseen object instances in tabletop environments. However, the type of large-scale real-world dataset required for this task typically does not exist for most robotic settings, which motivates the use of synthetic data. Our proposed method, UOIS-Net, separately leverages synthetic RGB and synthetic depth for unseen object instance segmentation. UOIS-Net comprises two stages: first, it operates only on depth to produce object instance center votes in 2D or 3D and assembles them into rough initial masks; second, these initial masks are refined using RGB. Surprisingly, our framework is able to learn from synthetic RGB-D data where the RGB is non-photorealistic. To train our method, we introduce a large-scale synthetic dataset of random objects on tabletops. We show that our method can produce sharp and accurate segmentation masks, outperforming state-of-the-art methods on unseen object instance segmentation. We also show that our method can segment unseen objects for robot grasping.
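To make the two-stage structure concrete, below is a minimal sketch in PyTorch of the data flow the abstract describes: a depth-only stage that predicts per-pixel foreground and center votes that are assembled into rough masks, followed by an RGB-based refinement stage. All module names (DepthSeedingNet, RGBRefinementNet), layer sizes, and the mask-assembly shortcut are hypothetical illustrations, not the authors' implementation.

```python
import torch
import torch.nn as nn

class DepthSeedingNet(nn.Module):
    """Stage 1 (hypothetical): consumes depth only and predicts, per pixel,
    foreground logits and a 2D offset vote toward the object's center."""
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
        )
        self.fg_head = nn.Conv2d(32, 2, 1)      # background / foreground logits
        self.offset_head = nn.Conv2d(32, 2, 1)  # (dx, dy) center votes

    def forward(self, depth):
        feat = self.backbone(depth)
        return self.fg_head(feat), self.offset_head(feat)

class RGBRefinementNet(nn.Module):
    """Stage 2 (hypothetical): refines a rough initial mask using RGB."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(4, 32, 3, padding=1), nn.ReLU(),  # RGB + initial mask
            nn.Conv2d(32, 1, 1),
        )

    def forward(self, rgb, initial_mask):
        x = torch.cat([rgb, initial_mask], dim=1)
        return torch.sigmoid(self.net(x))

# Toy forward pass on random inputs.
depth = torch.randn(1, 1, 64, 64)
rgb = torch.randn(1, 3, 64, 64)
fg_logits, votes = DepthSeedingNet()(depth)
# In practice the votes would be clustered into per-instance masks;
# here we just use the foreground prediction as a single rough mask.
rough_mask = fg_logits.argmax(dim=1, keepdim=True).float()
refined = RGBRefinementNet()(rgb, rough_mask)
print(refined.shape)  # torch.Size([1, 1, 64, 64])
```

Note the division of labor this sketch mirrors: depth, which transfers well from simulation, drives instance discovery, while RGB is used only for boundary refinement, which is why non-photorealistic synthetic RGB suffices.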