Learning controllers that reproduce the legged locomotion found in nature has been a long-standing goal in robotics and computer graphics. While recent approaches yield promising results, they are not yet flexible enough to apply to legged systems of different morphologies. This is partly because they often rely on precise motion-capture references or elaborate learning environments that ensure the naturalness of the emergent locomotion gaits but prevent generalization. This work proposes a generic approach for ensuring realistic locomotion by guiding the learning process with the spring-loaded inverted pendulum (SLIP) model as a reference. Leveraging the exploration capabilities of Reinforcement Learning (RL), we learn a control policy that fills the information gap between the template model and the full-body dynamics required to maintain stable and periodic locomotion. The proposed approach can be applied to robots of different sizes and morphologies and adapted to any RL technique and control architecture. We present experimental results showing that, even in a model-free setup and with a simple reactive control architecture, the learned policies generate realistic and energy-efficient locomotion gaits for a bipedal and a quadrupedal robot. Most importantly, this is achieved without motion capture data, strong constraints on the robot's dynamics or kinematics, or prescribed limb coordination. We provide supplementary videos for a qualitative analysis of the naturalness of the learned gaits.
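To make the idea of SLIP-guided learning more concrete, the sketch below is an illustrative example rather than code from the paper: it integrates the stance-phase dynamics of a planar spring-loaded inverted pendulum and scores how closely a robot's center of mass tracks the template trajectory. The parameter values, function names, and the Gaussian-kernel tracking reward are assumptions introduced here for illustration only.

```python
import numpy as np

# Hypothetical template-model parameters (not taken from the paper).
M = 30.0     # point mass [kg]
K = 5000.0   # leg spring stiffness [N/m]
L0 = 0.6     # leg rest length [m]
G = 9.81     # gravitational acceleration [m/s^2]

def slip_stance_step(state, foot, dt=1e-3):
    """One explicit-Euler step of the stance-phase SLIP dynamics.

    state = (x, z, vx, vz): CoM position and velocity in the sagittal plane.
    foot  = (fx, fz): fixed foothold position during stance.
    """
    x, z, vx, vz = state
    fx, fz = foot
    dx, dz = x - fx, z - fz
    l = np.hypot(dx, dz)            # current leg length
    f = K * (L0 - l)                # spring force along the leg (compression > 0)
    ax = f * dx / (l * M)
    az = f * dz / (l * M) - G
    return (x + vx * dt, z + vz * dt, vx + ax * dt, vz + az * dt)

def slip_tracking_reward(com, com_ref, sigma=0.05):
    """Gaussian-kernel reward on the distance between the robot CoM and the
    SLIP reference CoM; one possible way to guide the RL policy toward the
    template behavior (an assumed reward form, for illustration only)."""
    err = np.linalg.norm(np.asarray(com) - np.asarray(com_ref))
    return np.exp(-(err / sigma) ** 2)
```

In this sketch, the template trajectory produced by repeatedly calling `slip_stance_step` would serve as the reference that the learned full-body policy is rewarded for tracking, while RL exploration fills in the joint-level details the template does not specify.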