6-DoF grasp learning from point cloud input has achieved great success, yet the computational cost incurred by the orderlessness of point sets remains a concern. As an alternative, this paper explores grasp generation from RGB-D input. The proposed solution, Keypoint-GraspNet, detects the projections of the gripper keypoints in image space and then recovers the SE(3) grasp poses with a PnP algorithm. To examine the idea, we construct a synthetic dataset based on primitive shapes and grasp families. Metric-based evaluation shows that our method outperforms the baselines in grasp proposal accuracy, diversity, and time cost. Finally, robot experiments show a high success rate, demonstrating the potential of the idea in real-world applications.
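To make the pose-recovery step concrete, the following is a minimal sketch of recovering an SE(3) pose from the image projections of gripper keypoints. It is not the implementation used by Keypoint-GraspNet: the abstract only states that "a PnP algorithm" is used, so this sketch swaps in a simple homography-based planar PnP, which assumes exactly four coplanar keypoints with no three collinear. The keypoint layout `KEYPOINTS`, the intrinsics `INTRINSICS`, and all function names are hypothetical, and the pixel coordinates are synthesized from a known pose rather than detected by a network.

```python
import math

# Hypothetical gripper keypoints (metres) in the gripper frame, all on the
# plane Z = 0. The actual keypoint layout is not given in the abstract.
KEYPOINTS = [(-0.04, 0.0, 0.0), (0.04, 0.0, 0.0),
             (-0.04, 0.06, 0.0), (0.04, 0.06, 0.0)]
# Hypothetical pinhole intrinsics (fx, fy, cx, cy) in pixels.
INTRINSICS = (600.0, 600.0, 320.0, 240.0)


def matmul(A, B):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def project(R, t, intr, pts):
    """Pinhole projection of gripper-frame points under the pose (R, t)."""
    fx, fy, cx, cy = intr
    pixels = []
    for p in pts:
        xc, yc, zc = (sum(R[i][k] * p[k] for k in range(3)) + t[i]
                      for i in range(3))
        pixels.append((fx * xc / zc + cx, fy * yc / zc + cy))
    return pixels


def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def norm(v):
    return math.sqrt(sum(a * a for a in v))


def pnp_planar(pixels, intr, pts):
    """Recover (R, t) from four coplanar (Z = 0) keypoints and their pixels.

    Estimates the homography H ~ [r1 r2 t] with its last entry fixed to 1
    (valid when the gripper is in front of the camera), then rescales and
    orthonormalizes its columns.
    """
    fx, fy, cx, cy = intr
    A, rhs = [], []
    for (u, v), (X, Y, _) in zip(pixels, pts):
        x, y = (u - cx) / fx, (v - cy) / fy  # normalized image coordinates
        A.append([X, Y, 1.0, 0.0, 0.0, 0.0, -x * X, -x * Y])
        rhs.append(x)
        A.append([0.0, 0.0, 0.0, X, Y, 1.0, -y * X, -y * Y])
        rhs.append(y)
    h = solve_linear(A, rhs)
    c1, c2, c3 = [h[0], h[3], h[6]], [h[1], h[4], h[7]], [h[2], h[5], 1.0]
    scale = 2.0 / (norm(c1) + norm(c2))  # equals the depth t_z of the origin
    t = [scale * a for a in c3]
    n1 = norm(c1)
    r1 = [a / n1 for a in c1]
    dot = sum(p * q for p, q in zip(r1, c2))
    r2 = [q - dot * p for p, q in zip(r1, c2)]  # Gram-Schmidt step
    n2 = norm(r2)
    r2 = [a / n2 for a in r2]
    r3 = [r1[1] * r2[2] - r1[2] * r2[1],
          r1[2] * r2[0] - r1[0] * r2[2],
          r1[0] * r2[1] - r1[1] * r2[0]]
    R = [[r1[i], r2[i], r3[i]] for i in range(3)]
    return R, t


# Synthetic check: project the keypoints under a known pose, then recover it.
a, b = 0.5, 0.3
Ry = [[math.cos(a), 0.0, math.sin(a)], [0.0, 1.0, 0.0],
      [-math.sin(a), 0.0, math.cos(a)]]
Rz = [[math.cos(b), -math.sin(b), 0.0], [math.sin(b), math.cos(b), 0.0],
      [0.0, 0.0, 1.0]]
R_true = matmul(Rz, Ry)
t_true = [0.05, -0.02, 0.5]
pixels = project(R_true, t_true, INTRINSICS, KEYPOINTS)
R_est, t_est = pnp_planar(pixels, INTRINSICS, KEYPOINTS)
```

With noise-free projections, `R_est` and `t_est` should match `R_true` and `t_true` up to floating-point error. With noisy detections, a least-squares or iterative PnP solver would be a more robust choice than this closed-form sketch.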