Spring-based actuators in legged locomotion improve energy efficiency and performance, but complicate controller design. While previous work has relied on extensive modeling and simulation to find optimal controllers for such systems, we propose to learn model-free controllers directly on the real robot. In our approach, gaits are first synthesized by central pattern generators (CPGs), whose parameters are optimized to quickly obtain an open-loop controller that achieves efficient locomotion. Then, to make this controller more robust and further improve performance, we use reinforcement learning to close the loop by learning corrective actions on top of the CPGs. We evaluate the proposed approach on the DLR elastic quadruped bert. Our results in learning trotting and pronking gaits show that exploitation of the spring actuator dynamics emerges naturally from optimizing for dynamic motions, yielding high-performing locomotion despite the controller being model-free. The whole process takes no more than 1.5 hours on the real robot and results in natural-looking gaits.
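To make the CPG-based open-loop stage concrete, the sketch below shows one common way such a gait generator can be structured: each leg's desired joint position is a rhythmic signal with a fixed per-leg phase offset, and diagonal leg pairs share a phase for a trot. The frequency, amplitude, and phase values here are illustrative placeholders, not the optimized parameters from the paper.

```python
import math

# Trot gait: diagonal leg pairs (front-left/rear-right and
# front-right/rear-left) oscillate in phase, and the two pairs
# are half a cycle apart. Phases are in cycles (0..1).
TROT_PHASES = {"FL": 0.0, "RR": 0.0, "FR": 0.5, "RL": 0.5}

def cpg_setpoint(t, leg, freq_hz=2.0, amplitude=0.4, offset=0.0):
    """Desired hip joint angle (rad) for one leg at time t (seconds).

    freq_hz, amplitude, and offset stand in for the CPG parameters
    that the paper optimizes on the real robot.
    """
    phase = 2.0 * math.pi * (freq_hz * t + TROT_PHASES[leg])
    return offset + amplitude * math.sin(phase)

def gait_setpoints(t):
    """Open-loop joint setpoints for all four legs at time t."""
    return {leg: cpg_setpoint(t, leg) for leg in TROT_PHASES}
```

In the full approach described above, a reinforcement-learning policy would then add small corrective actions to these open-loop setpoints, closing the loop around the CPG.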