The ability to grasp objects is an essential skill that enables many robotic manipulation tasks. Recent works have studied point cloud-based methods for object grasping, starting from simulated datasets, and have shown promising performance in real-world scenarios. Nevertheless, many of them still rely on ad-hoc geometric heuristics to generate grasp candidates, which fail to generalize to objects whose shapes differ significantly from those observed during training. Several approaches exploit complex multi-stage learning strategies and local neighborhood feature extraction while ignoring global semantic information. Furthermore, they are inefficient in terms of the number of training samples and the time required for inference. In this paper, we propose an end-to-end learning solution to generate 6-DOF parallel-jaw grasps starting from a partial 3D view of the object. Our Learning to Grasp (L2G) method gathers information from the input point cloud through a new procedure that combines a differentiable sampling strategy to identify the visible contact points with a feature encoder that leverages local and global cues. Overall, L2G is guided by a multi-task objective that generates a diverse set of grasps by optimizing contact point sampling, grasp regression, and grasp classification. With a thorough experimental analysis, we show the effectiveness of L2G as well as its robustness and generalization abilities.
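To make the contrast concrete, the sketch below shows farthest point sampling, a classic geometric heuristic for spreading candidate contact points over a point cloud. This is illustrative only: it is the kind of fixed, non-differentiable candidate-selection rule that prior methods rely on, whereas L2G replaces it with a learned, differentiable sampler trained end-to-end (the function name and point format here are our own, not from the paper).

```python
def farthest_point_sampling(points, k):
    """Greedy farthest point sampling over a list of (x, y, z) tuples:
    iteratively pick the point farthest from everything selected so far.
    A fixed geometric heuristic for candidate contact-point selection,
    shown here as the baseline that a learned sampler would replace."""
    selected = [0]  # deterministic start at the first point
    # Squared distance from each point to its nearest selected point.
    dist = [float("inf")] * len(points)
    while len(selected) < k:
        last = points[selected[-1]]
        for i, p in enumerate(points):
            d = sum((a - b) ** 2 for a, b in zip(p, last))
            if d < dist[i]:
                dist[i] = d
        selected.append(max(range(len(points)), key=dist.__getitem__))
    return selected


# Example: two clusters on a line; the sampler spreads picks across them.
pts = [(0, 0, 0), (1, 0, 0), (9, 0, 0), (10, 0, 0)]
print(farthest_point_sampling(pts, 2))  # → [0, 3]
```

Because the selection is a hard argmax, no gradient flows through the choice of indices, which is precisely why such heuristics cannot be tuned jointly with the downstream grasp regressor.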