Grasp detection in cluttered scenes is a challenging task for robots. Generating synthetic grasping data is a popular way to train and test grasp methods, as in Dex-Net and GraspNet; however, these methods generate training grasps on synthetic 3D object models but are evaluated on images or point clouds drawn from different distributions, which reduces performance on real scenes because of sparse grasp labels and covariate shift. To address these problems, we propose a novel on-policy grasp detection method that trains and tests on the same distribution, using dense pixel-level grasp labels generated on RGB-D images. A Parallel-Depth Grasp Generation (PDG-Generation) method is proposed: it first generates a parallel depth image through a new imaging model that projects points in parallel, then generates multiple candidate grasps for each pixel and obtains robust grasps through flatness detection, a force-closure metric, and collision detection. Next, a large, comprehensive Pixel-Level Grasp Pose Dataset (PLGP-Dataset) is constructed and released; in contrast to previous datasets, which contain off-policy data and sparse grasp samples, this is the first pixel-level grasp dataset with an on-policy distribution, in which grasps are generated from depth images. Finally, we build and test a series of pixel-level grasp detection networks with a data augmentation process for imbalanced training; these networks learn grasp poses in a decoupled manner from the input RGB-D images. Extensive experiments show that our on-policy grasp method largely closes the gap between simulation and reality and achieves state-of-the-art performance. Code and data are available at https://github.com/liuchunsense/PLGP-Dataset.
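To make the idea of a parallel depth image concrete, the following is a minimal sketch of a generic parallel (orthographic) projection of 3D points onto a pixel grid. It is not the PDG-Generation imaging model itself, whose details are not given here; the function name `parallel_depth_image`, its parameters, and the convention that the camera looks along +z are all assumptions made for illustration.

```python
import math


def parallel_depth_image(points, pixel_size, width, height):
    """Project 3D points onto a grid along parallel rays (hypothetical helper).

    Assumptions for this sketch: points are (x, y, z) tuples in metres, the
    viewing direction is +z, and every ray is parallel to the z axis, so the
    pixel a point falls into depends only on x and y, never on its depth.
    Each pixel keeps the smallest z it receives (the nearest surface);
    pixels that receive no point stay at infinity.
    """
    depth = [[math.inf] * width for _ in range(height)]
    for x, y, z in points:
        col = math.floor(x / pixel_size)
        row = math.floor(y / pixel_size)
        # Skip points that fall outside the image bounds.
        if 0 <= col < width and 0 <= row < height:
            depth[row][col] = min(depth[row][col], z)
    return depth


# Usage: three points on a 2 x 1 grid with 1 cm pixels, plus one out of bounds.
cloud = [(0.0, 0.0, 0.5), (0.001, 0.001, 0.3), (0.015, 0.0, 0.7), (0.05, 0.0, 0.1)]
image = parallel_depth_image(cloud, pixel_size=0.01, width=2, height=1)
```

Unlike a perspective (pinhole) projection, the pixel coordinates here are not divided by depth, so an object keeps the same image footprint regardless of its distance from the sensor.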