A CPG-Based Agile and Versatile Locomotion Framework Using Proximal Symmetry Loss

Humanoid robots are made to resemble humans, but their locomotion abilities are far from ours in terms of agility and versatility. When humans walk on complex terrains or face external disturbances, they combine a set of strategies, unconsciously and efficiently, to regain stability. This paper tackles the problem of developing a robust omnidirectional walking framework that is able to generate versatile and agile locomotion on complex terrains. The Linear Inverted Pendulum Model and Central Pattern Generator concepts are used to develop a closed-loop walk engine, which is then combined with a reinforcement learning module. This module learns to regulate the walk engine parameters adaptively and generates residuals to adjust the robot's target joint positions (residual physics). Additionally, we propose a proximal symmetry loss function to increase the sample efficiency of the Proximal Policy Optimization algorithm by leveraging model symmetries and the trust region concept. The effectiveness of the proposed framework was demonstrated and evaluated across a set of challenging simulation scenarios. The robot was able to generalize what it learned to unforeseen circumstances, displaying human-like locomotion skills, even in the presence of noise and external pushes.
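The abstract names the Linear Inverted Pendulum Model (LIPM) without stating its equations. As a rough illustration only, the sketch below uses the standard textbook form of the model, in which the center of mass stays at a constant height z0 and its horizontal motion obeys x'' = (g / z0)(x - p), with p the support point. The function names, parameter names, and numeric values are assumptions made for this example; they are not taken from the paper's walk engine.

```python
import math

GRAVITY = 9.81  # m/s^2, assumed constant


def lipm_state(x0, v0, p, z0, t):
    """Closed-form LIPM solution along one horizontal axis.

    x0, v0: initial center-of-mass position (m) and velocity (m/s)
    p:      support (pivot) position (m), held fixed during the step
    z0:     constant center-of-mass height (m)
    t:      elapsed time (s)
    Returns the position and velocity at time t.
    """
    omega = math.sqrt(GRAVITY / z0)  # natural frequency of the pendulum
    dx = x0 - p
    c, s = math.cosh(omega * t), math.sinh(omega * t)
    x = p + dx * c + (v0 / omega) * s
    v = dx * omega * s + v0 * c
    return x, v


def orbital_energy(x, v, p, z0):
    """Orbital energy of the LIPM, conserved while the support is fixed."""
    omega_sq = GRAVITY / z0
    return 0.5 * v ** 2 - 0.5 * omega_sq * (x - p) ** 2


if __name__ == "__main__":
    # Hypothetical numbers: the center of mass starts 5 cm behind the
    # support point and moves forward at 0.4 m/s.
    x, v = lipm_state(x0=-0.05, v0=0.4, p=0.0, z0=0.8, t=0.3)
    print(f"x = {x:.4f} m, v = {v:.4f} m/s")
```

A closed-form solution like this lets a walk engine predict where the center of mass will be at the end of a step without numerical integration. How the paper combines it with the Central Pattern Generator and the learned residuals is not specified in the abstract, so that part is left out of the sketch.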
