In this paper, we review the question of which action space is best suited for controlling a real biped robot in combination with Sim2Real training. Position control has been popular because it has been shown to be more sample-efficient and more intuitive to combine with other planning algorithms. However, position control requires gain tuning to achieve the best possible policy performance. We show that using a torque-based action space instead enables task- and robot-agnostic learning with less parameter tuning, and mitigates the sim-to-real gap by taking advantage of the inherent compliance of torque control. We also accelerate training of the torque-based policy by pre-training it to remain upright by compensating for gravity. The paper showcases the first successful sim-to-real transfer of a torque-based deep reinforcement learning policy on a real human-sized biped robot. The video is available at https://youtu.be/CR6pTS39VRE.
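To make the contrast between the two action spaces concrete, the following is a minimal sketch, not the paper's implementation. It assumes a policy that outputs actions in roughly [-1, 1] per joint; the function names, `action_scale`, `q_nominal`, and the gain and limit values are all hypothetical and chosen only for illustration. In the position-based case the policy output passes through a PD loop whose gains `kp` and `kd` must be tuned, while in the torque-based case the output is scaled directly to a joint torque.

```python
import numpy as np


def position_action_to_torque(action, q, qd, q_nominal, kp, kd, action_scale=0.5):
    """Position action space (illustrative): the policy outputs offsets from a
    nominal joint pose, and a PD loop converts the target into joint torques.
    The gains kp and kd are the parameters that require tuning."""
    q_target = q_nominal + action_scale * action
    return kp * (q_target - q) - kd * qd


def torque_action_to_torque(action, torque_limit):
    """Torque action space (illustrative): the policy output is clipped to
    [-1, 1] and scaled directly by the per-joint torque limit; no PD gains."""
    return np.clip(action, -1.0, 1.0) * torque_limit


if __name__ == "__main__":
    q = np.zeros(2)          # current joint positions [rad]
    qd = np.zeros(2)         # current joint velocities [rad/s]
    q_nominal = np.zeros(2)  # hypothetical nominal pose
    action = np.array([1.0, -1.0])

    print(position_action_to_torque(action, q, qd, q_nominal, kp=100.0, kd=2.0))
    print(torque_action_to_torque(action, torque_limit=np.array([10.0, 20.0])))
```

In this sketch, the torque produced by the position-based mapping depends on `kp`, `kd`, and `action_scale`, so the same policy output yields different joint torques under different gains. The torque-based mapping depends only on the actuator limits.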