We propose to address quadrupedal locomotion tasks using Reinforcement Learning (RL) with a Transformer-based model that learns to combine proprioceptive information and high-dimensional depth sensor inputs. While learning-based locomotion has made great advances using RL, most methods still rely on domain randomization for training blind agents that generalize to challenging terrains. Our key insight is that proprioceptive states only offer contact measurements for immediate reaction, whereas an agent equipped with visual sensory observations can learn to proactively maneuver through environments with obstacles and uneven terrain by anticipating changes in the environment many steps ahead. In this paper, we introduce LocoTransformer, an end-to-end RL method that leverages both proprioceptive states and visual observations for locomotion control. We evaluate our method in challenging simulated environments with different obstacles and uneven terrain. We transfer our learned policy from simulation to a real robot by running it indoors and in the wild with unseen obstacles and terrain. Our method not only significantly improves over baselines, but also achieves far better generalization performance, especially when transferred to the real robot. Our project page with videos is at https://rchalyang.github.io/LocoTransformer/.
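The abstract describes a Transformer that jointly processes proprioceptive states and depth-image observations. As a rough, self-contained illustration of this kind of token-level fusion, the sketch below projects a proprioceptive vector and depth-image patches into a shared token space and mixes them with a single self-attention layer. All names, dimensions, and the single-head design are our own simplifications for exposition, not the authors' actual LocoTransformer architecture.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def depth_to_tokens(depth, patch_size, w_patch):
    """Split a square depth image into patches and linearly project each to a token."""
    h, w = depth.shape
    patches = (depth.reshape(h // patch_size, patch_size, w // patch_size, patch_size)
                    .transpose(0, 2, 1, 3)
                    .reshape(-1, patch_size * patch_size))
    return patches @ w_patch  # (num_patches, dim)

def fuse(proprio, depth, patch_size=4, dim=32, seed=0):
    """Toy single-head attention fusing one proprioceptive token with depth-patch tokens."""
    rng = np.random.default_rng(seed)
    w_prop = rng.standard_normal((proprio.size, dim)) / np.sqrt(proprio.size)
    w_patch = rng.standard_normal((patch_size ** 2, dim)) / np.sqrt(patch_size ** 2)
    wq, wk, wv = (rng.standard_normal((dim, dim)) / np.sqrt(dim) for _ in range(3))

    # one proprioceptive token followed by one token per depth patch
    tokens = np.vstack([proprio @ w_prop, depth_to_tokens(depth, patch_size, w_patch)])
    q, k, v = tokens @ wq, tokens @ wk, tokens @ wv
    attn = softmax(q @ k.T / np.sqrt(dim))  # every token attends to every other token
    fused = attn @ v                        # (num_tokens, dim)
    return fused.mean(axis=0)               # pooled feature for a downstream policy head

features = fuse(np.random.randn(93), np.random.randn(64, 64))
print(features.shape)  # (32,)
```

Because visual and proprioceptive tokens attend to each other in one shared sequence, the pooled feature can reflect terrain changes visible many steps ahead rather than only current contact information, which is the intuition the abstract emphasizes.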