We present the Versatile Grasp Quality Convolutional Neural Network (VGQ-CNN), a grasp quality prediction network for 6-DOF grasps. VGQ-CNN can be used to evaluate grasps for objects seen from a wide range of camera poses, or from mobile robots, without the need to retrain the network. By defining the grasp orientation explicitly as an input to the network, VGQ-CNN can evaluate 6-DOF grasp poses, moving beyond the 4-DOF grasps used in most image-based grasp evaluation methods like GQ-CNN. To train VGQ-CNN, we generate the new Versatile Grasp dataset (VG-dset), containing 6-DOF grasps observed from a wide range of camera poses. VGQ-CNN achieves a balanced accuracy of 82.1% on our test split while generalising to a variety of camera poses. Meanwhile, it achieves competitive performance for overhead cameras and top grasps, with a balanced accuracy of 74.2% compared to GQ-CNN's 76.6%. We also propose a modified network architecture, FAST-VGQ-CNN, that speeds up inference using a shared encoder architecture and can make 128 grasp quality predictions in 12 ms on a CPU. Code and data are available at https://aucoroboticsmu.github.io/vgq-cnn/.
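The following is a minimal illustrative sketch, not the authors' implementation, of the two architectural ideas the abstract names: (1) feeding the grasp orientation explicitly to the network so that full 6-DOF grasp poses can be scored, and (2) a FAST-VGQ-CNN-style shared encoder that encodes the depth image once and then scores a batch of candidate grasps cheaply. All layer sizes, names, and the 7-dimensional grasp encoding (3-D position plus quaternion) are hypothetical assumptions.

```python
# Hypothetical sketch of a shared-encoder grasp quality network (not VGQ-CNN's
# actual architecture). The image is encoded once; each candidate grasp pose is
# an explicit input fused with the shared image features.
import torch
import torch.nn as nn

class SharedEncoderGraspNet(nn.Module):
    def __init__(self, feat_dim: int = 128, grasp_dim: int = 7):
        super().__init__()
        # Image branch: run once per depth image crop.
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 5), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 5), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, feat_dim), nn.ReLU(),
        )
        # Grasp branch: the 6-DOF grasp pose is an explicit network input.
        self.grasp_mlp = nn.Sequential(nn.Linear(grasp_dim, feat_dim), nn.ReLU())
        # Head: fuse image features with each grasp; predict quality in [0, 1].
        self.head = nn.Sequential(
            nn.Linear(2 * feat_dim, 64), nn.ReLU(),
            nn.Linear(64, 1), nn.Sigmoid(),
        )

    def forward(self, depth: torch.Tensor, grasps: torch.Tensor) -> torch.Tensor:
        # depth: (1, 1, H, W) single image; grasps: (N, grasp_dim) candidates.
        img_feat = self.encoder(depth)                   # encode once: (1, feat_dim)
        img_feat = img_feat.expand(grasps.shape[0], -1)  # share across all N grasps
        g_feat = self.grasp_mlp(grasps)
        return self.head(torch.cat([img_feat, g_feat], dim=1)).squeeze(1)

net = SharedEncoderGraspNet()
depth = torch.rand(1, 1, 32, 32)  # depth image crop
grasps = torch.rand(128, 7)       # 128 candidate 6-DOF grasp poses
qualities = net(depth, grasps)    # one quality score per candidate
```

The design point is that the convolutional encoder, the expensive part, runs once per image regardless of how many grasps are evaluated, which is what makes batch sizes like 128 predictions per image fast on a CPU.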