In this article, we show that learned policies can be applied to solve legged locomotion control tasks with extensive flight phases, such as those encountered in space exploration. Using an off-the-shelf deep reinforcement learning algorithm, we trained a neural network to control a jumping quadruped robot while solely using its limbs for attitude control. We present tasks of increasing complexity leading to a combination of three-dimensional (re-)orientation and landing locomotion behaviors of a quadruped robot traversing simulated low-gravity celestial bodies. We show that our approach easily generalizes across these tasks and successfully trains policies for each case. Using sim-to-real transfer, we deploy trained policies in the real world on the SpaceBok robot placed on an experimental testbed designed for two-dimensional micro-gravity experiments. The experimental results demonstrate that repetitive, controlled jumping and landing with natural agility is possible.