Neural Gaits: Learning Bipedal Locomotion via Control Barrier Functions and Zero Dynamics Policies

This work presents Neural Gaits, a method for learning dynamic walking gaits through the enforcement of set invariance that can be refined episodically using experimental data from the robot. We frame walking as a set invariance problem enforceable via control barrier functions (CBFs) defined on the reduced-order dynamics quantifying the underactuated component of the robot: the zero dynamics. Our approach contains two learning modules: one for learning a policy that satisfies the CBF condition, and another for learning a residual dynamics model to refine imperfections of the nominal model. Importantly, learning only over the zero dynamics significantly reduces the dimensionality of the learning problem while using CBFs allows us to still make guarantees for the full-order system. The method is demonstrated experimentally on an underactuated bipedal robot, where we are able to show agile and dynamic locomotion, even with partially unknown dynamics.
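To make the set-invariance idea concrete, the following is a minimal sketch of a CBF condition on a toy system. It is not the formulation used in this work: the single-integrator dynamics, the barrier `h(x) = 1 - x^2`, the gain `alpha`, and the function names are all assumptions chosen for illustration. In the paper's setting the barrier is defined on the robot's zero dynamics and the policy is learned; here the input is a fixed nominal command, minimally corrected in closed form so that the condition `dh/dt + alpha * h >= 0` holds.

```python
# Toy illustration of a control barrier function (CBF) safety filter.
# Assumed system (hypothetical, not from the paper): x_dot = u, a 1-D single integrator.
# Assumed safe set: {x : h(x) >= 0} with h(x) = 1 - x^2, i.e. |x| <= 1.


def h(x):
    """Barrier function; non-negative exactly on the safe set."""
    return 1.0 - x * x


def safety_filter(x, u_nom, alpha=5.0):
    """Return the input closest to u_nom that satisfies the CBF condition.

    CBF condition for x_dot = u:  dh/dx * u + alpha * h(x) >= 0,
    which is the linear constraint a * u + b >= 0 in u.
    """
    a = -2.0 * x          # dh/dx times the input gain (which is 1 here)
    b = alpha * h(x)      # class-K term; the drift is zero for this system
    slack = a * u_nom + b
    if slack >= 0.0:
        return u_nom      # nominal input already satisfies the condition
    # Project u_nom onto the constraint boundary a * u + b = 0.
    # Inside the safe set b >= 0, so slack < 0 implies a != 0.
    return u_nom - slack / a


def simulate(x0, u_nom, steps=1000, dt=0.01, use_filter=True):
    """Forward-Euler rollout; returns the list of visited states."""
    xs = [x0]
    x = x0
    for _ in range(steps):
        u = safety_filter(x, u_nom) if use_filter else u_nom
        x = x + dt * u
        xs.append(x)
    return xs


if __name__ == "__main__":
    filtered = simulate(0.0, 1.0, use_filter=True)
    unfiltered = simulate(0.0, 1.0, use_filter=False)
    print("max |x| with filter:   ", max(abs(x) for x in filtered))
    print("max |x| without filter:", max(abs(x) for x in unfiltered))
```

With the filter active, the nominal command is followed until it would violate the condition, after which the state approaches the boundary of the safe set without crossing it. The unfiltered rollout leaves the set. The one-dimensional constraint admits this closed-form projection; higher-dimensional problems typically solve a quadratic program instead.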
