Sim-to-real transfer is a mainstream way to cope with the large number of trials required by typical deep reinforcement learning methods. However, transferring a policy trained in simulation to actual hardware remains an open challenge because of the reality gap. In particular, the characteristics of actuators in legged robots have a considerable influence on sim-to-real transfer. There are two challenges: 1) high-reduction-ratio gears are widely used in actuators, and the reality gap becomes especially pronounced when backdrivability is taken into account to control joints compliantly; 2) stable bipedal locomotion is difficult to achieve, so typical system identification methods fail to transfer the policy sufficiently. To address these two challenges, we propose 1) a new simulation model of gears and 2) a system identification method that can utilize failed attempts. The method's effectiveness is verified on a biped robot, the ROBOTIS-OP3: the sim-to-real transferred policy can stabilize the robot under severe disturbances and walk on uneven surfaces without using force and torque sensors.