The ability to grasp objects reliably is crucial in robotics, as it enables many interactive downstream applications. To this end, most approaches either compute the full 6D pose of the object of interest or learn to predict a set of grasping points. The former do not yet scale well to multiple object instances or classes, while the latter require large annotated datasets and generalize poorly to new geometries. To overcome these shortcomings, we propose to teach a robot how to grasp an object with a simple and short human demonstration. Hence, our approach neither requires many annotated images nor is restricted to a specific geometry. We first present a short sequence of RGB-D images displaying a human-object interaction. This sequence is then used to build hand and object meshes that represent the depicted interaction. Subsequently, we complete the missing parts of the reconstructed object shape and estimate the relative transformation between the reconstruction and the visible object in the scene. Finally, we combine the a priori knowledge of the relative pose between object and human hand with the estimate of the current object pose in the scene to derive the grasping instructions for the robot. Extensive evaluations with Toyota's Human Support Robot (HSR) in real and synthetic environments demonstrate the applicability of our proposed methodology and its advantage over previous approaches.
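The final transfer step can be read as a composition of rigid transformations. The following is a minimal sketch under stated assumptions, not the authors' implementation: it assumes the demonstrated hand pose is available as a 4x4 homogeneous transform in the frame of the reconstructed object, and that the estimated object pose maps that frame into the scene. The names `make_transform`, `transfer_grasp`, `T_scene_object`, and `T_object_hand` are hypothetical and chosen only for illustration.

```python
import numpy as np


def make_transform(yaw_rad, translation):
    """Build a 4x4 homogeneous transform from a rotation about z and a translation.

    Hypothetical helper for illustration; a real pipeline would use full 3D rotations.
    """
    c, s = np.cos(yaw_rad), np.sin(yaw_rad)
    T = np.eye(4)
    T[:3, :3] = [[c, -s, 0.0],
                 [s, c, 0.0],
                 [0.0, 0.0, 1.0]]
    T[:3, 3] = translation
    return T


def transfer_grasp(T_scene_object, T_object_hand):
    """Express the demonstrated hand pose in the scene frame.

    T_scene_object: assumed estimate of the object pose in the scene.
    T_object_hand:  assumed hand pose relative to the object, from the demonstration.
    """
    return T_scene_object @ T_object_hand


# Demonstration: the hand sits 10 cm along the object's x-axis.
T_object_hand = make_transform(0.0, [0.1, 0.0, 0.0])
# Scene: the object is 1 m along x and rotated 90 degrees about z.
T_scene_object = make_transform(np.pi / 2, [1.0, 0.0, 0.0])

T_scene_grasp = transfer_grasp(T_scene_object, T_object_hand)
print(np.round(T_scene_grasp[:3, 3], 3))
```

In this sketch the grasp target follows the object: rotating or moving the object in the scene moves the transferred hand pose with it, which is why a single demonstration can in principle be reused for new object placements. Converting the resulting pose into a gripper command is robot-specific and is not covered here.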