In this work, we introduce a novel, end-to-end trainable CNN-based architecture that delivers high-quality results for both semantic segmentation and grasp detection suitable for a parallel-plate gripper. Building on this architecture, we propose a novel refinement module that uses the previously computed grasp detection and semantic segmentation results to further increase grasp detection accuracy. Our proposed network achieves state-of-the-art accuracy on two popular grasp datasets, namely Cornell and Jacquard. As an additional contribution, we provide a novel dataset extension for the OCID dataset, making it possible to evaluate grasp detection in highly challenging scenes. Using this dataset, we show that semantic segmentation can additionally be used to assign grasp candidates to object classes, which makes it possible to pick specific objects in the scene.
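To illustrate the last point, the following is a minimal sketch of how grasp candidates might be assigned to object classes using a segmentation mask. It is not the method of this work: it assumes each grasp is a tuple `(cx, cy, w, h, theta)` in pixel coordinates and simply reads the class id at the grasp center. The function names `assign_grasps_to_classes` and `grasps_for_class`, the grasp layout, and the class ids are all hypothetical.

```python
import numpy as np

def assign_grasps_to_classes(grasps, seg_mask):
    """Label each grasp with the class id found at its center pixel.

    grasps:   array-like of shape (N, 5), rows (cx, cy, w, h, theta)  [assumed layout]
    seg_mask: integer array of shape (H, W) with per-pixel class ids
    """
    grasps = np.asarray(grasps, dtype=float).reshape(-1, 5)
    height, width = seg_mask.shape
    # Round centers to the nearest pixel and clamp them to the image bounds.
    cols = np.clip(np.rint(grasps[:, 0]).astype(int), 0, width - 1)
    rows = np.clip(np.rint(grasps[:, 1]).astype(int), 0, height - 1)
    return seg_mask[rows, cols]

def grasps_for_class(grasps, seg_mask, target_class):
    """Keep only the grasps whose center lies on the requested class."""
    grasps = np.asarray(grasps, dtype=float).reshape(-1, 5)
    labels = assign_grasps_to_classes(grasps, seg_mask)
    return grasps[labels == target_class]

# Toy scene: left half is background (0), right half is hypothetical class 2.
seg_mask = np.zeros((4, 6), dtype=int)
seg_mask[:, 3:] = 2
grasps = [(1.0, 1.0, 2.0, 1.0, 0.0), (4.0, 2.0, 2.0, 1.0, 0.5)]
labels = assign_grasps_to_classes(grasps, seg_mask)
selected = grasps_for_class(grasps, seg_mask, target_class=2)
```

A center-pixel lookup is the simplest possible rule; a majority vote over the pixels covered by the grasp rectangle would likely be more robust near object boundaries.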