Robotic dexterous grasping is challenging due to the high degrees of freedom (DoFs) and complex contacts of multi-fingered robotic hands. Existing deep reinforcement learning (DRL) based methods leverage human demonstrations to reduce the sample complexity caused by the high-dimensional action space of dexterous grasping. However, less attention has been paid to hand-object interaction representations that support high-level generalization. In this paper, we propose a novel geometric and spatial hand-object interaction representation, named DexRep, to capture dynamic object shape features and the spatial relations between hands and objects during grasping. DexRep comprises an Occupancy Feature, which encodes the rough object shape within the sensing range of the moving hand; a Surface Feature, which tracks the changing distances between the hand and the object surface; and a Local-Geo Feature, which describes the local geometric surface features most relevant to potential contacts. Based on this new representation, we propose a DRL-based method to learn a generalizable dexterous grasping policy, DexRepNet. Experimental results show that our method dramatically outperforms baselines that use existing representations for robotic grasping, in both grasp success rate and convergence speed. It achieves a 93\% grasping success rate on seen objects and success rates above 80\% on diverse objects from unseen categories, in both simulation and real-world experiments.
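To make the three feature groups concrete, the following is a minimal Python sketch of how such a DexRep-style observation vector might be assembled at each control step. Every function name, parameter, and dimension below is a hypothetical illustration under our own assumptions, not the paper's actual implementation.

```python
import numpy as np

def occupancy_feature(object_points, hand_pose, grid_size=8, sensing_range=0.2):
    """Coarse occupancy grid of object points within the moving hand's sensing range."""
    # Transform object points into the hand frame (hand_pose: 4x4 homogeneous matrix).
    homog = np.c_[object_points, np.ones(len(object_points))]
    pts_h = (np.linalg.inv(hand_pose) @ homog.T).T[:, :3]
    in_range = pts_h[np.all(np.abs(pts_h) < sensing_range, axis=1)]
    grid = np.zeros((grid_size,) * 3, dtype=np.float32)
    if len(in_range) > 0:
        idx = ((in_range + sensing_range) / (2 * sensing_range) * grid_size).astype(int)
        idx = np.clip(idx, 0, grid_size - 1)
        grid[idx[:, 0], idx[:, 1], idx[:, 2]] = 1.0
    return grid.ravel()  # rough shape within sensing range

def surface_feature(hand_keypoints, object_points):
    """Distance from each hand keypoint (e.g., fingertip) to the nearest object surface point."""
    d = np.linalg.norm(hand_keypoints[:, None, :] - object_points[None, :, :], axis=-1)
    return d.min(axis=1)  # changing hand-object surface distances

def local_geo_feature(hand_keypoints, object_points, object_normals):
    """Local surface geometry (nearest-point offset + normal) around potential contacts."""
    d = np.linalg.norm(hand_keypoints[:, None, :] - object_points[None, :, :], axis=-1)
    nearest = d.argmin(axis=1)
    offsets = object_points[nearest] - hand_keypoints   # vector to the closest surface point
    normals = object_normals[nearest]                   # local surface orientation
    return np.concatenate([offsets, normals], axis=1).ravel()

def dexrep_observation(hand_keypoints, hand_pose, object_points, object_normals):
    # Concatenation of the three feature groups described in the abstract.
    return np.concatenate([
        occupancy_feature(object_points, hand_pose),
        surface_feature(hand_keypoints, object_points),
        local_geo_feature(hand_keypoints, object_points, object_normals),
    ])
```

In a design like this, the concatenated vector would serve as the per-step observation for the DRL policy and would be recomputed as the hand moves relative to the object.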