Learning-based grasping can afford real-time motion planning of multi-fingered robotic hands thanks to its high computational efficiency. However, it must explore a large search space during learning, which leads to low learning efficiency and has been the main barrier to its practical adoption. In addition, the trained policy generalizes poorly to objects that are not identical or similar to those seen during training. In this work, we develop a novel Physics-Guided Deep Reinforcement Learning method with a Hierarchical Reward Mechanism to improve the learning efficiency and generalizability of learning-based autonomous grasping. Unlike conventional observation-based grasp learning, our method utilizes physics-informed metrics to convey correlations between features of the hand structure and the object, improving learning efficiency and outcomes. Further, a hierarchical reward mechanism is developed to enable the robot to learn the grasping task in a prioritized way. The method is validated in grasping tasks with a MICO robot arm in both simulation and physical experiments. The results show that our method outperforms the standard Deep Reinforcement Learning method in task performance by 48% and in learning efficiency by 40%.
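To make the two key ideas concrete, below is a minimal illustrative sketch (not taken from the paper) of what a hierarchical, physics-guided reward might look like for a grasping policy. The specific sub-goals, distance threshold, weights, and the toy physics-informed quality term are all assumptions chosen for illustration; the paper's actual metrics and staging may differ.

```python
import numpy as np

# Illustrative sketch only: stage thresholds, weights, and the grasp-quality
# proxy below are assumptions, not the paper's actual formulation.

def physics_informed_quality(contact_normals, contact_points, obj_center):
    """Toy physics-informed grasp-quality proxy: rewards finger contacts
    whose normals point toward the object center (stands in for a
    force-closure-style metric)."""
    vecs = contact_points - obj_center
    alignment = -np.sum(contact_normals * vecs, axis=1) / (
        np.linalg.norm(vecs, axis=1) + 1e-8)
    return float(np.clip(alignment.mean(), 0.0, 1.0))

def hierarchical_reward(dist_to_object, n_contacts, lifted, quality):
    """Prioritized (stage-wise) reward: reach -> contact -> stable grasp -> lift."""
    r = -dist_to_object                      # stage 1: approach the object
    if dist_to_object < 0.05:                # stage 2: establish finger contacts
        r += 0.5 * n_contacts
        r += 2.0 * quality                   # stage 3: physics-informed grasp quality
        if lifted:                           # stage 4: successful lift
            r += 10.0
    return r

if __name__ == "__main__":
    # Two opposing fingertip contacts on a small object at the origin.
    normals = np.array([[-1.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
    points = np.array([[0.03, 0.0, 0.0], [-0.03, 0.0, 0.0]])
    q = physics_informed_quality(normals, points, np.zeros(3))
    print(hierarchical_reward(dist_to_object=0.02, n_contacts=2,
                              lifted=True, quality=q))
```

The staged structure is what "learning the task in a prioritized way" could mean in practice: later reward terms only become active once earlier sub-goals are met, so the agent does not need to discover the full grasp-and-lift sequence from a flat sparse signal.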