In this paper, we present a novel deep neural network architecture for joint class-agnostic object segmentation and grasp detection for robotic picking tasks using a parallel-plate gripper. We introduce depth-aware Coordinate Convolution (CoordConv), a method to increase the accuracy of point-proposal-based object instance segmentation in complex scenes, without adding any additional network parameters or computational complexity. Depth-aware CoordConv uses depth data to extract prior information about the location of an object, enabling highly accurate object instance segmentation. The resulting segmentation masks, combined with predicted grasp candidates, yield a complete scene description for grasping with a parallel-plate gripper. We evaluate the accuracy of grasp detection and instance segmentation on challenging robotic picking datasets, namely Sil\'eane and OCID_grasp, and show the benefit of joint grasp detection and segmentation on a real-world robotic picking task.
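To make the depth-aware CoordConv idea concrete, the following is a minimal sketch of how depth data might augment the standard CoordConv input channels: normalized x/y coordinate maps plus a normalized depth channel concatenated to the network input. This is an illustrative assumption on our part (function name, normalization scheme, and channel layout are hypothetical), not the paper's exact formulation.

```python
import numpy as np

def depth_aware_coordconv_channels(depth, eps=1e-6):
    """Build extra input channels for a CoordConv-style layer.

    depth: (H, W) depth map.
    Returns a (3, H, W) array: x-coordinate channel, y-coordinate
    channel (both normalized to [-1, 1]), and a min-max-normalized
    depth channel. Hypothetical sketch; the paper's exact
    formulation may differ.
    """
    h, w = depth.shape
    ys = np.repeat(np.linspace(-1.0, 1.0, h)[:, None], w, axis=1)
    xs = np.repeat(np.linspace(-1.0, 1.0, w)[None, :], h, axis=0)
    d = (depth - depth.min()) / (depth.max() - depth.min() + eps)
    return np.stack([xs, ys, d])

# Example: channels for a small synthetic depth map.
depth = np.random.rand(48, 64)
chans = depth_aware_coordconv_channels(depth)
```

In a CoordConv layer these channels would simply be concatenated with the image features before the convolution, so no learnable parameters are added.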