Dexterous robotic hands have the capability to interact with a wide variety of household objects to perform tasks like grasping. However, learning robust real-world grasping policies for arbitrary objects has proven challenging due to the difficulty of generating high-quality training data. In this work, we propose a learning system (ISAGrasp) that leverages a small number of human demonstrations to bootstrap the generation of a much larger dataset containing successful grasps on a variety of novel objects. Our key insight is to use a correspondence-aware implicit generative model to deform object meshes and demonstrated human grasps, generating a diverse dataset of novel objects and successful grasps for supervised learning while maintaining semantic realism. We use this dataset to train a robust grasping policy in simulation which can be deployed in the real world. We demonstrate grasping performance with a four-fingered Allegro hand in both simulation and the real world, and show that this method can handle entirely new semantic classes and achieve a 79% success rate on grasping unseen objects in the real world.
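To make the data-generation idea concrete, the sketch below illustrates the core augmentation step in Python under stated assumptions: a correspondence-aware deformation field is applied both to an object's mesh vertices and to the demonstrated grasp contact points, so the grasp stays attached to the same semantic surface regions on the deformed shape. The function `deform_and_transfer` and the toy `toy_field` are hypothetical placeholders, not the paper's actual implicit generative model.

```python
import numpy as np

def deform_and_transfer(vertices, grasp_contacts, deform_field, latent):
    """Deform an object mesh and transfer a demonstrated grasp to it.

    vertices:       (N, 3) mesh vertex positions of the source object
    grasp_contacts: (K, 3) fingertip contact points from a human demo
    deform_field:   callable (points, latent) -> displaced points; stands in
                    for the correspondence-aware implicit generative model
    latent:         latent shape code sampled from the model's prior
    """
    new_vertices = deform_field(vertices, latent)
    # Because the same correspondence-aware field is applied to the contact
    # points, the demonstrated grasp follows the surface regions it touched
    # (e.g., a mug handle) onto the newly generated object.
    new_contacts = deform_field(grasp_contacts, latent)
    return new_vertices, new_contacts

# Toy usage: an anisotropic scaling stands in for the learned deformation.
rng = np.random.default_rng(0)
toy_field = lambda pts, z: pts * (1.0 + 0.2 * z)  # z broadcasts over xyz
verts = rng.standard_normal((100, 3))
contacts = rng.standard_normal((4, 3))            # four fingertip contacts
z = rng.standard_normal(3)                        # sampled latent shape code
new_verts, new_contacts = deform_and_transfer(verts, contacts, toy_field, z)
```

Repeating this over many sampled latent codes would yield the kind of diverse (object, grasp) pairs described in the abstract, which can then be verified in simulation before supervised policy training.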