Spring-based actuators in legged locomotion provide energy efficiency and improved performance, but increase the difficulty of controller design. Whereas previous works have focused on extensive modeling and simulation to find optimal controllers for such systems, we propose to learn model-free controllers directly on the real robot. In our approach, gaits are first synthesized by central pattern generators (CPGs), whose parameters are optimized to quickly obtain an open-loop controller that achieves efficient locomotion. Then, to make that controller more robust and further improve performance, we use reinforcement learning to close the loop by learning corrective actions on top of the CPGs. We evaluate the proposed approach on DLR's elastic quadruped bert. Our results in learning trotting and pronking gaits show that exploitation of the spring actuator dynamics emerges naturally from optimizing for dynamic motions, yielding high-performing locomotion despite being model-free. The whole process takes no more than 1.5 hours on the real robot and results in natural-looking gaits.
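The two-stage structure described above (an open-loop CPG, then learned corrections added on top) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the oscillator model, the policy, or any parameter values. Here a plain sine phase oscillator per leg stands in for the CPG, and the function names, phase offsets, frequency, amplitude, and correction bound are all hypothetical choices made for illustration.

```python
import math

# Hypothetical phase offsets (fractions of a cycle) per leg,
# in the order: front-left, front-right, hind-left, hind-right.
GAIT_OFFSETS = {
    "trot": (0.0, 0.5, 0.5, 0.0),   # diagonal leg pairs move together
    "pronk": (0.0, 0.0, 0.0, 0.0),  # all four legs move together
}


def cpg_targets(t, gait, frequency_hz=2.0, amplitude=0.3):
    """Open-loop targets: one sine phase oscillator per leg (assumed model)."""
    return [
        amplitude * math.sin(2.0 * math.pi * (frequency_hz * t + offset))
        for offset in GAIT_OFFSETS[gait]
    ]


def closed_loop_targets(t, gait, correction, max_correction=0.1):
    """Closed-loop targets: CPG output plus a bounded corrective action.

    `correction` stands in for the output of a learned policy; clipping it
    keeps the CPG as the dominant source of the motion (an assumption).
    """
    base = cpg_targets(t, gait)
    clipped = [max(-max_correction, min(max_correction, c)) for c in correction]
    return [b + c for b, c in zip(base, clipped)]
```

In this sketch, the first stage would tune `frequency_hz`, `amplitude`, and the offsets, and the second stage would train a policy that supplies `correction` from sensor feedback. Bounding the correction is one possible design choice for keeping exploration on hardware conservative; the abstract does not say whether the original work does this.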