Contemporary grasp detection approaches employ deep learning to achieve robustness to sensor and object model uncertainty. The two dominant approaches design either grasp-quality scoring networks or anchor-based grasp recognition networks. This paper presents a different approach that treats grasp detection as keypoint detection. The deep network detects each grasp candidate as a pair of keypoints, convertible to the grasp representation g = {x, y, w, θ}^T, rather than as a triplet or quartet of corner points. Reducing the detection difficulty by grouping keypoints into pairs boosts performance. To further capture dependencies between keypoints, a general non-local module is incorporated into the proposed learning framework. A final filtering strategy based on discrete and continuous orientation prediction removes false keypoint correspondences and further improves grasp detection performance. GKNet, the approach presented here, achieves the best balance of accuracy and speed on the Cornell and the abridged Jacquard datasets (96.9% and 98.39% accuracy at 41.67 and 23.26 fps, respectively). Follow-up experiments on a manipulator evaluate GKNet using four types of grasping experiments reflecting different nuisance sources: static grasping, dynamic grasping, grasping at varied camera angles, and bin picking. GKNet outperforms the reference baselines in the static and dynamic grasping experiments while showing robustness in the varied-camera-viewpoint and bin-picking experiments. The results confirm the hypothesis that grasp keypoints are an effective output representation for deep grasp networks, providing robustness to expected nuisance factors.
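To make the keypoint-pair representation concrete, the sketch below converts two detected keypoints into the grasp configuration g = {x, y, w, θ}^T. This is a minimal illustration under the assumption that the two keypoints mark the gripper fingertip locations; the function name and the angle-wrapping convention are hypothetical and not taken from the GKNet implementation.

```python
import numpy as np

def keypoints_to_grasp(p1, p2):
    """Convert a detected keypoint pair into the 2-D grasp
    configuration g = (x, y, w, theta).

    p1, p2 : (x, y) image coordinates of the two grasp keypoints,
             assumed here to be the gripper fingertip locations.
    """
    p1 = np.asarray(p1, dtype=float)
    p2 = np.asarray(p2, dtype=float)

    x, y = (p1 + p2) / 2.0        # grasp center: midpoint of the keypoints
    dx, dy = p2 - p1
    w = np.hypot(dx, dy)          # grasp width: keypoint separation
    theta = np.arctan2(dy, dx)    # grasp orientation in radians

    # A parallel-jaw grasp is symmetric under 180-degree rotation,
    # so wrap the orientation into [-pi/2, pi/2) (assumed convention).
    theta = (theta + np.pi / 2.0) % np.pi - np.pi / 2.0

    return np.array([x, y, w, theta])

# Example: a horizontal, 40-pixel-wide grasp centered at (120, 85).
print(keypoints_to_grasp((100, 85), (140, 85)))   # -> [120. 85. 40. 0.]
```

Compared with predicting a triplet or quartet of corner points, this two-point output leaves fewer correspondences for the network to resolve, which is the source of the detection-difficulty reduction claimed above.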