This paper details our winning submission to Phase 1 of the 2021 Real Robot Challenge, in which a three-fingered robot must carry a cube along specified goal trajectories. To solve Phase 1, we use a pure reinforcement learning approach that requires minimal expert knowledge of the robotic system or of robotic grasping in general. A sparse, goal-based reward is employed in conjunction with Hindsight Experience Replay to teach the control policy to move the cube to the desired x and y coordinates of the goal. Simultaneously, a dense distance-based reward is employed to teach the policy to lift the cube to the z coordinate (the height component) of the goal. The policy is trained in simulation with domain randomisation before being transferred to the real robot for evaluation. Although performance tends to worsen after this transfer, our best policy can successfully lift the real cube along goal trajectories via an effective pinching grasp. Our approach outperforms all other submissions, including those leveraging more traditional robotic control techniques, and is the first pure learning-based method to solve this challenge.
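The reward structure described in the abstract, a sparse goal-based term on the x and y coordinates combined with a dense distance-based term on the z coordinate, can be illustrated with a minimal sketch. This is not the authors' implementation: the tolerance `XY_TOLERANCE`, the weight `Z_WEIGHT`, the 0/-1 reward convention, all function names, and the choice to relabel only the x and y components of the goal in hindsight are assumptions made for illustration.

```python
import numpy as np

# Hypothetical constants; the abstract does not give these values.
XY_TOLERANCE = 0.02  # metres: xy distance counted as "goal reached"
Z_WEIGHT = 1.0       # scale of the dense height term


def sparse_xy_reward(cube_pos, goal_pos, tol=XY_TOLERANCE):
    """Sparse term: 0 if the cube is within tol of the goal in the xy plane, else -1."""
    cube_xy = np.asarray(cube_pos, dtype=float)[:2]
    goal_xy = np.asarray(goal_pos, dtype=float)[:2]
    return 0.0 if np.linalg.norm(cube_xy - goal_xy) <= tol else -1.0


def dense_z_reward(cube_pos, goal_pos, weight=Z_WEIGHT):
    """Dense term: negative absolute height error, so it rises as the cube is lifted to the goal z."""
    return -weight * abs(float(cube_pos[2]) - float(goal_pos[2]))


def total_reward(cube_pos, goal_pos):
    """Sum of the sparse xy term and the dense z term."""
    return sparse_xy_reward(cube_pos, goal_pos) + dense_z_reward(cube_pos, goal_pos)


def her_relabel_xy(achieved_pos, original_goal):
    """Hindsight relabelling (assumed variant): swap in the achieved xy, keep the original z."""
    new_goal = np.array(original_goal, dtype=float)
    new_goal[:2] = np.asarray(achieved_pos, dtype=float)[:2]
    return new_goal


if __name__ == "__main__":
    cube = [0.05, -0.03, 0.02]   # cube ended up off-target and still near the table
    goal = [0.00, 0.00, 0.08]    # original goal
    print(total_reward(cube, goal))                        # sparse term is -1 here
    print(total_reward(cube, her_relabel_xy(cube, goal)))  # sparse term becomes 0
```

Under these assumptions, relabelling turns a failed transition into one where the sparse xy term is satisfied, while the dense z term still penalises the remaining height error. This is one way the two signals could be trained simultaneously.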