Robot learning is often restricted to planar manipulation because of its high data consumption. In that setting, a common approach is to use a fully-convolutional neural network to estimate the reward of grasp primitives. In this work, we extend this approach by parametrizing the two remaining, lateral Degrees of Freedom (DoFs) of the primitives. We apply this principle to the task of 6 DoF bin picking: we introduce a model-based controller that calculates angles which avoid collisions and maximize the grasp quality while keeping the uncertainty small. As the controller is integrated into the training, our hybrid approach is able to learn about and exploit the model-based controller. After real-world training with 27,000 grasp attempts, the robot is able to grasp known objects with a success rate of over 92% in dense clutter. Grasp inference takes less than 50 ms. In further real-world experiments, we evaluate grasp rates in a range of scenarios, including the system's ability to generalize to unknown objects. We show that the system is able to avoid collisions, enabling grasps that would not be possible without primitive adaptation.
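The abstract states the controller's goals (avoid collisions, maximize grasp quality, keep uncertainty small) but not how they are combined. The following is a minimal sketch of one way such a selection could look, assuming a discrete set of candidate angle pairs and a simple penalized score. The names `AngleCandidate`, `select_lateral_angles`, and `uncertainty_weight` are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass
class AngleCandidate:
    """One candidate pair of lateral angles (all fields are illustrative assumptions)."""
    b: float            # first lateral angle, in radians
    c: float            # second lateral angle, in radians
    quality: float      # estimated grasp quality, higher is better
    uncertainty: float  # estimated uncertainty, lower is better
    collides: bool      # True if a collision is predicted for this pair


def select_lateral_angles(
    candidates: Sequence[AngleCandidate],
    uncertainty_weight: float = 1.0,
) -> Optional[Tuple[float, float]]:
    """Return the collision-free (b, c) pair with the best penalized score.

    The score quality - uncertainty_weight * uncertainty is an assumed
    trade-off, not the formulation used by the authors.
    """
    feasible = [k for k in candidates if not k.collides]
    if not feasible:
        return None  # every candidate collides, so no grasp is proposed
    best = max(feasible, key=lambda k: k.quality - uncertainty_weight * k.uncertainty)
    return best.b, best.c


# Usage: the highest-quality candidate collides, so a different pair is chosen.
candidates = [
    AngleCandidate(b=-0.2, c=-0.1, quality=0.9, uncertainty=0.1, collides=True),
    AngleCandidate(b=0.0, c=-0.1, quality=0.7, uncertainty=0.3, collides=False),
    AngleCandidate(b=0.0, c=0.1, quality=0.8, uncertainty=0.1, collides=False),
    AngleCandidate(b=0.2, c=-0.1, quality=0.6, uncertainty=0.1, collides=False),
]
print(select_lateral_angles(candidates))  # (0.0, 0.1)
```

Treating collisions as a hard constraint and uncertainty as a soft penalty is a design choice of this sketch; a real controller might instead work on continuous angles or use a different weighting.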