This work introduces a hierarchical strategy for terrain-aware bipedal locomotion that integrates reduced-dimensional perceptual representations to enhance reinforcement learning (RL)-based high-level (HL) policies for real-time gait generation. Unlike end-to-end approaches, our framework leverages latent terrain encodings produced by a Convolutional Variational Autoencoder (CNN-VAE) alongside reduced-order robot dynamics, allowing the locomotion decision process to operate on a compact state representation. We systematically analyze the impact of latent space dimensionality on learning efficiency and policy robustness. We also extend the method to be history-aware, incorporating sequences of recent terrain observations into the latent representation for additional robustness. To address real-world feasibility, we introduce a distillation method that learns the latent representation directly from depth camera images, and we provide preliminary hardware validation by comparing simulated and real sensor data. We further validate the framework in the high-fidelity Agility Robotics (AR) simulator, which incorporates realistic sensor noise, state estimation, and actuator dynamics. The results confirm the robustness and adaptability of our method, underscoring its potential for hardware deployment.
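To make the pipeline concrete, the sketch below illustrates the core idea of the abstract: a small CNN-VAE encoder compresses a local terrain observation into a low-dimensional latent vector, which is concatenated with a reduced-order robot state to form the compact observation fed to the HL policy. This is a minimal illustrative example, not the authors' exact architecture; the heightmap size (32x32), latent dimension (16), and reduced-order state dimension (12) are assumptions made for the sketch.

```python
import torch
import torch.nn as nn


class TerrainVAEEncoder(nn.Module):
    """Illustrative CNN-VAE encoder for local terrain heightmaps (hypothetical sizes)."""

    def __init__(self, latent_dim: int = 16):
        super().__init__()
        # Assumes a 1 x 32 x 32 heightmap patch around the robot.
        self.conv = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=4, stride=2, padding=1),   # 32 -> 16
            nn.ELU(),
            nn.Conv2d(16, 32, kernel_size=4, stride=2, padding=1),  # 16 -> 8
            nn.ELU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2, padding=1),  # 8 -> 4
            nn.ELU(),
            nn.Flatten(),
        )
        self.fc_mu = nn.Linear(64 * 4 * 4, latent_dim)
        self.fc_logvar = nn.Linear(64 * 4 * 4, latent_dim)

    def forward(self, heightmap: torch.Tensor):
        h = self.conv(heightmap)
        mu, logvar = self.fc_mu(h), self.fc_logvar(h)
        # Reparameterization trick: sample z ~ N(mu, sigma^2) during VAE training;
        # at policy time the mean mu is commonly used as the deterministic encoding.
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
        return z, mu, logvar


if __name__ == "__main__":
    encoder = TerrainVAEEncoder(latent_dim=16)
    heightmap = torch.randn(1, 1, 32, 32)      # local terrain patch (assumed size)
    reduced_order_state = torch.randn(1, 12)   # reduced-order robot state (assumed size)
    _, mu, _ = encoder(heightmap)
    # Compact HL policy observation: terrain latent + reduced-order dynamics state.
    hl_policy_input = torch.cat([mu, reduced_order_state], dim=-1)
    print(hl_policy_input.shape)               # torch.Size([1, 28])
```

The history-aware variant described above would extend this by encoding a short sequence of recent terrain observations (e.g., by stacking patches along the channel dimension) rather than a single snapshot.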