Developing robust walking controllers for bipedal robots is a challenging endeavor. Traditional model-based locomotion controllers require simplifying assumptions and careful modelling; any small errors can result in unstable control. To address these challenges for bipedal locomotion, we present a model-free reinforcement learning framework for training robust locomotion policies in simulation, which can then be transferred to a real bipedal Cassie robot. To facilitate sim-to-real transfer, domain randomization is used to encourage the policies to learn behaviors that are robust across variations in system dynamics. The learned policies enable Cassie to perform a set of diverse and dynamic behaviors, while also being more robust than traditional controllers and prior learning-based methods that use residual control. We demonstrate this on versatile walking behaviors such as tracking a target walking velocity, walking height, and turning yaw.
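As a rough illustration of the domain-randomization idea mentioned above, the sketch below resamples simulator dynamics parameters at the start of every training episode so the policy never overfits to a single set of physics parameters. The parameter names, ranges, and the `env.set_dynamics` hook are assumptions made for illustration only; the abstract does not specify which quantities are randomized for Cassie or over what bounds.

```python
import numpy as np

# Hypothetical randomization ranges; the actual parameters and bounds used
# for Cassie are not given in the abstract.
RANDOMIZATION_RANGES = {
    "link_mass_scale":    (0.8, 1.2),   # multiplicative scale on link masses
    "joint_damping":      (0.5, 2.5),   # joint damping coefficient (N*m*s/rad)
    "ground_friction":    (0.5, 1.2),   # Coulomb friction coefficient
    "motor_torque_scale": (0.9, 1.1),   # multiplicative scale on commanded torque
}


def sample_dynamics(rng: np.random.Generator) -> dict:
    """Draw one set of dynamics parameters, uniformly within each range."""
    return {name: rng.uniform(lo, hi)
            for name, (lo, hi) in RANDOMIZATION_RANGES.items()}


def run_training_episode(env, policy, rng: np.random.Generator):
    """Resample dynamics every episode so the policy only sees randomized physics."""
    env.set_dynamics(sample_dynamics(rng))  # assumed simulator hook, not a real API
    obs = env.reset()
    done = False
    while not done:
        action = policy(obs)
        obs, reward, done, info = env.step(action)
```

Training the policy only on such randomized episodes is what encourages behaviors that remain stable across the dynamics variations encountered on the real robot.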