Instance segmentation of unseen objects is a challenging problem in unstructured environments. To solve this problem, we propose a robot learning approach that actively interacts with novel objects and collects training labels for each object, which are then used to fine-tune and improve the segmentation model, avoiding the time-consuming process of manually labeling a dataset. The Singulation-and-Grasping (SaG) policy is trained through end-to-end reinforcement learning. Given a cluttered pile of objects, our approach chooses pushing and grasping motions to break up the clutter and performs object-agnostic grasping, with the SaG policy taking visual observations and imperfect segmentations as input. We decompose the problem into three subtasks: (1) the object singulation subtask separates the objects from each other, creating more space that alleviates the difficulty of (2) the collision-free grasping subtask; and (3) the mask generation subtask obtains self-labeled ground-truth masks for transfer learning using an optical flow-based binary classifier and motion-cue post-processing. Our system achieves a 70% singulation success rate in simulated cluttered scenes. The interactive segmentation of our system achieves 87.8%, 73.9%, and 69.3% average precision for toy blocks in simulation, YCB objects in simulation, and real-world novel objects, respectively, outperforming several baselines.
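The self-labeling idea behind the mask generation subtask can be illustrated with a minimal sketch: pixels that moved between consecutive frames (large optical flow magnitude) are labeled as the manipulated object. The function name `self_label_mask` and the fixed magnitude threshold are illustrative assumptions only; the paper uses a learned optical flow-based binary classifier followed by motion-cue post-processing.

```python
import numpy as np

def self_label_mask(flow, threshold=1.0):
    """Binary-classify pixels as 'moved' vs. 'static' from a dense optical
    flow field of shape (H, W, 2). Pixels whose flow magnitude exceeds the
    threshold are treated as belonging to the moved object.
    NOTE: a simplified stand-in for the paper's learned classifier."""
    magnitude = np.linalg.norm(flow, axis=-1)  # per-pixel displacement length
    return magnitude > threshold               # boolean (H, W) object mask

# Toy example: a synthetic flow field where only a 2x2 patch moved.
flow = np.zeros((4, 4, 2))
flow[1:3, 1:3] = [3.0, 4.0]   # displacement vector of magnitude 5
mask = self_label_mask(flow, threshold=1.0)
```

In practice the flow field would come from an optical flow estimator run on images captured before and after a robot interaction, and the resulting mask would serve as a self-labeled ground-truth annotation for fine-tuning.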