Recent advances in legged locomotion have enabled quadrupeds to walk on challenging terrains. However, bipedal robots are inherently less stable, which makes it harder to design walking controllers for them. In this work, we leverage recent advances in rapid adaptation for locomotion control and extend them to bipedal robots. As in prior work, we start with a base policy that produces actions while taking as input an extrinsics vector estimated by an adaptation module. This extrinsics vector contains information about the environment and enables the walking controller to adapt rapidly online. However, the extrinsics estimator may be imperfect, which can degrade the performance of a base policy that expects a perfect estimator. In this paper, we propose A-RMA (Adapting RMA), which additionally adapts the base policy to the imperfect extrinsics estimator by finetuning it using model-free RL. We demonstrate that A-RMA outperforms a number of RL-based baseline controllers and model-based controllers in simulation, and we show zero-shot deployment of a single A-RMA policy that enables a bipedal robot, Cassie, to walk in a variety of real-world scenarios beyond those seen during training. Videos and results are available at https://ashish-kmr.github.io/a-rma/
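To make the data flow described above concrete, the following is a minimal Python sketch of a base policy that consumes an estimated extrinsics vector from an adaptation module at every control step. It is an illustration under stated assumptions, not the authors' implementation: the class names, the dimensions, the use of a state-action history as the adaptation module's input, and the random linear maps standing in for trained networks are all hypothetical. The finetuning stage is not shown; in this picture it would correspond to updating only the `BasePolicy` weights with model-free RL while the policy keeps receiving the estimator's imperfect output.

```python
import random
from collections import deque

# Hypothetical sizes chosen only for illustration.
STATE_DIM, ACTION_DIM, EXTRINSICS_DIM, HISTORY_LEN = 4, 2, 3, 5


def make_linear(in_dim, out_dim, rng):
    """Random weight matrix standing in for a trained network (assumption)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]


def apply_linear(weights, x):
    """Matrix-vector product: one output value per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


class AdaptationModule:
    """Estimates the extrinsics vector from a short state-action history."""

    def __init__(self, rng):
        step_dim = STATE_DIM + ACTION_DIM
        self.weights = make_linear(HISTORY_LEN * step_dim, EXTRINSICS_DIM, rng)
        # Start with an all-zero history of fixed length.
        self.history = deque([[0.0] * step_dim for _ in range(HISTORY_LEN)],
                             maxlen=HISTORY_LEN)

    def update(self, state, action):
        """Append the latest (state, action) pair; the oldest entry is dropped."""
        self.history.append(list(state) + list(action))

    def estimate(self):
        """Return the estimated extrinsics vector for the current history."""
        flat = [v for step in self.history for v in step]
        return apply_linear(self.weights, flat)


class BasePolicy:
    """Maps (state, previous action, estimated extrinsics) to an action."""

    def __init__(self, rng):
        in_dim = STATE_DIM + ACTION_DIM + EXTRINSICS_DIM
        self.weights = make_linear(in_dim, ACTION_DIM, rng)

    def act(self, state, prev_action, extrinsics):
        return apply_linear(self.weights,
                            list(state) + list(prev_action) + list(extrinsics))


def rollout(policy, adapter, states):
    """Run the control loop over a sequence of states and collect the actions."""
    prev_action = [0.0] * ACTION_DIM
    actions = []
    for state in states:
        z_hat = adapter.estimate()               # possibly imperfect estimate
        action = policy.act(state, prev_action, z_hat)
        adapter.update(state, action)            # history feeds the next estimate
        prev_action = action
        actions.append(action)
    return actions
```

A possible usage, with a fixed seed and placeholder states, would be `rollout(BasePolicy(rng), AdaptationModule(rng), states)` for `rng = random.Random(0)`; the result holds one action vector per input state.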