We present a hierarchical framework that combines model-based control and reinforcement learning (RL) to synthesize robust controllers for a quadruped (the Unitree Laikago). The system consists of a high-level controller that learns to choose from a set of primitives in response to changes in the environment and a low-level controller that utilizes an established control method to robustly execute the primitives. Our framework learns a controller that can adapt to challenging environmental changes on the fly, including novel scenarios not seen during training. The learned controller is up to 85~percent more energy efficient and more robust than baseline methods. We also deploy the controller on a physical robot without any randomization or adaptation scheme.
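To make the division of labor between the two levels concrete, the following minimal Python sketch (illustrative only; the class names, dimensions, and decision period are assumptions, not the paper's implementation) shows a control loop in which a learned high-level policy selects one of a small set of primitives at a coarse timescale, while a low-level model-based controller produces the per-step commands that execute the chosen primitive.

```python
# Illustrative sketch of a hierarchical control loop (not the authors' code).
# All names, dimensions, and timescales below are hypothetical placeholders.
import numpy as np

NUM_PRIMITIVES = 4          # e.g., distinct contact/gait patterns (assumed)
HIGH_LEVEL_PERIOD = 50      # low-level steps per high-level decision (assumed)
STATE_DIM, ACTION_DIM = 12, 12

class HighLevelPolicy:
    """Stand-in for the learned RL policy that maps the state to a primitive."""
    def __init__(self, rng):
        self.weights = rng.normal(size=(NUM_PRIMITIVES, STATE_DIM)) * 0.01

    def select_primitive(self, state):
        scores = self.weights @ state
        return int(np.argmax(scores))

class LowLevelController:
    """Stand-in for the model-based controller that executes a primitive."""
    def compute_action(self, state, primitive_id):
        # A real controller would compute contact forces / joint torques
        # (e.g., via optimization-based whole-body control); we return zeros.
        return np.zeros(ACTION_DIM)

def run_episode(env_step, initial_state, policy, controller, horizon=1000):
    state = initial_state
    primitive_id = policy.select_primitive(state)
    for t in range(horizon):
        if t % HIGH_LEVEL_PERIOD == 0:
            # The high level re-plans only every HIGH_LEVEL_PERIOD steps.
            primitive_id = policy.select_primitive(state)
        action = controller.compute_action(state, primitive_id)
        state = env_step(state, action)
    return state

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    dummy_env = lambda s, a: s + 0.001 * rng.normal(size=STATE_DIM)
    final = run_episode(dummy_env, np.zeros(STATE_DIM),
                        HighLevelPolicy(rng), LowLevelController())
    print("final state norm:", np.linalg.norm(final))
```

The key design choice reflected here is the separation of timescales: the RL policy only decides which primitive to run and does so infrequently, while the robust low-level controller handles the high-rate execution details.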