Achieving stable and robust perceptive locomotion for bipedal robots in unstructured outdoor environments remains a critical challenge because of complex terrain geometry and susceptibility to external disturbances. In this work, we propose a novel reward design inspired by the Linear Inverted Pendulum Model (LIPM) to enable perceptive and stable locomotion in the wild. The LIPM provides theoretical guidance for dynamic balance by regulating the center of mass (CoM) height and the torso orientation. Both are key factors for terrain-aware locomotion, as they help ensure a stable viewpoint for the robot's camera. Building on this insight, we design a reward function that promotes balance and dynamic stability while encouraging accurate CoM trajectory tracking. To adaptively trade off velocity tracking against stability, we use the Reward Fusion Module (RFM) approach, which prioritizes stability when needed. A double-critic architecture separately evaluates the stability and locomotion objectives, improving training efficiency and robustness. We validate our approach through extensive experiments on a bipedal robot in both simulation and real-world outdoor environments. The results demonstrate superior terrain adaptability, disturbance rejection, and consistent performance across a wide range of speeds and perceptual conditions.
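To make the LIPM quantities named in the abstract concrete, the following is a minimal Python sketch under stated assumptions. The function `lipm_step` uses the standard one-dimensional LIPM dynamics, in which the CoM at constant height `z_c` accelerates away from the support point `p`. The function `stability_reward`, its gains `k_h` and `k_o`, and its exponential form are hypothetical illustrations of a term that penalizes CoM height and torso orientation errors; they are not the reward proposed in this work.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2


def lipm_step(x, v, p, z_c, dt):
    """Advance the 1-D LIPM, x_ddot = (G / z_c) * (x - p), by dt.

    x, v : CoM position and velocity
    p    : support (pivot) point position, held fixed over the step
    z_c  : constant CoM height assumed by the LIPM
    Uses the closed-form solution of the linear dynamics.
    """
    omega = math.sqrt(G / z_c)
    c = math.cosh(omega * dt)
    s = math.sinh(omega * dt)
    x_new = p + (x - p) * c + (v / omega) * s
    v_new = (x - p) * omega * s + v * c
    return x_new, v_new


def stability_reward(com_height, roll, pitch, z_ref, k_h=50.0, k_o=5.0):
    """Hypothetical stability term (an assumption, not the paper's reward).

    Equals 1 when the CoM sits at the reference height z_ref with a level
    torso, and decays as height or orientation errors grow.
    """
    height_err = (com_height - z_ref) ** 2
    tilt_err = roll ** 2 + pitch ** 2
    return math.exp(-k_h * height_err) * math.exp(-k_o * tilt_err)
```

In this sketch, a CoM resting exactly over the support point stays there, while any offset grows over time, which is why the LIPM motivates keeping the CoM height and torso orientation regulated. A multiplicative exponential form is one common way to keep such a term bounded in (0, 1]; other shapes would serve equally well for illustration.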