We approach the problem of high-DOF reaching-and-grasping by learning joint planning of grasp and motion with deep reinforcement learning. To address the sample-efficiency problem in learning the high-dimensional, complex control of dexterous grasping, we propose an effective representation of the grasping state that characterizes the spatial interaction between the gripper and the target object. To represent gripper-object interaction, we adopt the Interaction Bisector Surface (IBS), which is the Voronoi diagram between two nearby 3D geometric objects and has been successfully applied to characterizing spatial relations between 3D objects. We find that IBS is surprisingly effective as a state representation because it informs the fine-grained control of each finger with that finger's spatial relation to the target object. This novel grasp representation, together with several technical contributions including a fast IBS approximation, a novel vector-based reward, and an effective training strategy, facilitates learning a strong control model for high-DOF grasping with good sample efficiency, dynamic adaptability, and cross-category generality. Experiments show that it generates high-quality dexterous grasps for complex shapes with smooth grasping motions.
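To make the IBS definition concrete, the following is a minimal sketch of a brute-force approximation: it samples a regular grid over the joint bounding box of two point sets and keeps the samples that are roughly equidistant from both. This is not the fast IBS approximation mentioned above; the function names (`nearest_distance`, `approximate_ibs`), the grid resolution, and the tolerance rule are all assumptions made for illustration.

```python
import numpy as np


def nearest_distance(queries, points):
    """Distance from each query point to its nearest neighbour in `points`."""
    # Pairwise differences, shape (num_queries, num_points, 3).
    diff = queries[:, None, :] - points[None, :, :]
    return np.linalg.norm(diff, axis=2).min(axis=1)


def approximate_ibs(points_a, points_b, resolution=20, tol=None):
    """Return grid samples roughly equidistant from two 3D point sets.

    Hypothetical helper: a brute-force stand-in for an IBS computation,
    where `points_a` could be gripper samples and `points_b` object samples.
    Requires resolution >= 2.
    """
    both = np.vstack([points_a, points_b])
    lo, hi = both.min(axis=0), both.max(axis=0)

    # Regular grid over the joint axis-aligned bounding box.
    axes = [np.linspace(lo[k], hi[k], resolution) for k in range(3)]
    grid = np.stack(np.meshgrid(*axes, indexing="ij"), axis=-1).reshape(-1, 3)

    d_a = nearest_distance(grid, points_a)
    d_b = nearest_distance(grid, points_b)

    if tol is None:
        # Assumed default: half the coarsest grid spacing.
        tol = 0.5 * np.max((hi - lo) / (resolution - 1))

    # Keep samples whose distances to the two sets nearly match.
    mask = np.abs(d_a - d_b) <= tol
    return grid[mask], d_a[mask]


if __name__ == "__main__":
    # Two parallel unit squares (corner points only), two units apart along x.
    a = np.array([[0.0, y, z] for y in (0.0, 1.0) for z in (0.0, 1.0)])
    b = a + np.array([2.0, 0.0, 0.0])
    ibs_points, ibs_dist = approximate_ibs(a, b, resolution=21)
    print(ibs_points.shape, ibs_points[:, 0].min(), ibs_points[:, 0].max())
```

For these two symmetric point sets, the retained samples lie on the mid-plane between them, which is what the bisector definition predicts. The pairwise distance computation grows with the product of grid size and point count, so a practical version would likely use a spatial index instead.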