Legged robots have enormous potential across a wide range of capabilities, from navigating unstructured terrain to high-speed running. However, designing robust controllers for highly agile dynamic motions remains a substantial challenge for roboticists. Reinforcement learning (RL) offers a promising data-driven approach for automatically training such controllers, but exploration in these high-dimensional, underactuated systems remains a significant hurdle to learning performant, naturalistic, and versatile agility skills. We propose a framework for training complex robotic skills by transferring experience from existing controllers to jumpstart learning new tasks. To leverage controllers we can acquire in practice, we design this framework to be flexible in terms of their source: the controllers may have been optimized for a different objective under different dynamics, or may require different knowledge of the surroundings, and thus may be highly suboptimal for the target task. We show that our method enables learning complex agile jumping behaviors, navigating to goal locations while walking on hind legs, and adapting to new environments. We also demonstrate that the agile behaviors learned in this way are graceful and safe enough to deploy in the real world.