Robotic dexterous grasping is challenging because multi-fingered robotic hands have many degrees of freedom (DoF) and make complex contacts with objects. Existing deep reinforcement learning (DRL) methods leverage human demonstrations to reduce the sample complexity caused by the high-dimensional action space of dexterous grasping. However, less attention has been paid to hand-object interaction representations that support high-level generalization. In this paper, we propose a novel geometric and spatial hand-object interaction representation, named DexRep, that captures dynamic object shape features and the spatial relations between the hand and the object during grasping. DexRep comprises three components: an Occupancy Feature, which encodes the rough object shape within the sensing range of the moving hand; a Surface Feature, which encodes the changing distances between the hand and object surfaces; and a Local-Geo Feature, which encodes the local geometric surface features most relevant to potential contacts. Based on this representation, we propose a dexterous deep reinforcement learning method that learns a generalizable grasping policy, DexRepNet. Experimental results show that our method substantially outperforms baselines that use existing representations for robotic grasping, in both grasp success rate and convergence speed. It achieves a 93% grasp success rate on seen objects and success rates above 80% on diverse objects from unseen categories, in both simulation and real-world experiments.
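To make the three components of the representation more concrete, the following is a minimal NumPy sketch of how features of this kind could be computed from point clouds. It is an illustration under stated assumptions, not the DexRep implementation: the function names, the cubic voxel grid, the `half_extent` and `resolution` parameters, the nearest-point distance, and the use of surface normals as the local geometric feature are all hypothetical choices that the abstract does not specify.

```python
import numpy as np


def occupancy_feature(anchor, object_points, half_extent=0.1, resolution=4):
    """Coarse binary voxel grid in a cube centred on `anchor` (assumed design)."""
    grid = np.zeros((resolution,) * 3, dtype=np.float32)
    voxel_size = 2.0 * half_extent / resolution
    # Shift points into the cube's local frame, [0, 2 * half_extent) per axis.
    local = object_points - anchor + half_extent
    idx = np.floor(local / voxel_size).astype(int)
    # Keep only points that fall inside the sensing cube.
    inside = np.all((idx >= 0) & (idx < resolution), axis=1)
    ix, iy, iz = idx[inside].T
    grid[ix, iy, iz] = 1.0
    return grid.ravel()


def nearest_object_points(hand_points, object_points):
    """Index of, and distance to, the closest object point for each hand point."""
    diff = hand_points[:, None, :] - object_points[None, :, :]
    dist = np.linalg.norm(diff, axis=2)
    nearest = dist.argmin(axis=1)
    return nearest, dist[np.arange(len(hand_points)), nearest]


def surface_feature(hand_points, object_points):
    """Hand-to-object surface distance per hand point (assumed: nearest point)."""
    _, dist = nearest_object_points(hand_points, object_points)
    return dist


def local_geo_feature(hand_points, object_points, object_normals):
    """Local geometry near likely contacts (assumed: normal at the nearest point)."""
    nearest, _ = nearest_object_points(hand_points, object_points)
    return object_normals[nearest].ravel()


def interaction_features(anchor, hand_points, object_points, object_normals):
    """Concatenate the three hypothetical features into one observation vector."""
    return np.concatenate([
        occupancy_feature(anchor, object_points),
        surface_feature(hand_points, object_points),
        local_geo_feature(hand_points, object_points, object_normals),
    ])
```

In a sketch like this, the occupancy grid changes as the hand anchor moves, while the distance and normal features change with the finger configuration, which loosely mirrors the "dynamic" character of the features described above. The resulting vector would be one plausible observation input to a DRL policy.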