Learning performant robot manipulation policies can be challenging due to high-dimensional continuous actions and complex physics-based dynamics. This difficulty can be alleviated through an intelligent choice of action space. Operational Space Control (OSC) has been used as an effective task-space controller for manipulation. Nonetheless, its effectiveness depends on the fidelity of the underlying dynamics model, and it is prone to failure when there are modeling errors. In this work, we propose OSC for Adaptation and Robustness (OSCAR), a data-driven variant of OSC that compensates for modeling errors by inferring relevant dynamics parameters from online trajectories. OSCAR decomposes dynamics learning into task-agnostic and task-specific phases, decoupling the dynamics dependencies of the robot from the extrinsics due to its environment. This structure enables robust zero-shot performance under out-of-distribution conditions and rapid adaptation to significant domain shifts through additional finetuning. We evaluate our method on a variety of simulated manipulation problems and find substantial improvements over an array of controller baselines. For more results and information, please visit https://cremebrule.github.io/oscar-web/.
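To make the dependence on model fidelity concrete, the following is a minimal sketch of the textbook OSC control law, not the OSCAR method itself. It assumes a full-row-rank Jacobian and a simple PD task-space target; the function name `osc_torques`, its parameters, and the gain values are hypothetical and chosen only for illustration.

```python
import numpy as np


def osc_torques(J, M, pos_err, vel, kp=100.0, kd=None, gravity=None):
    """Textbook operational space control law (illustrative sketch only).

    J:        (m, n) end-effector Jacobian, assumed full row rank
    M:        (n, n) joint-space mass matrix from the dynamics model
    pos_err:  (m,) task-space position error (target minus current)
    vel:      (m,) task-space velocity of the end effector
    gravity:  optional (n,) gravity compensation torques
    """
    if kd is None:
        kd = 2.0 * np.sqrt(kp)  # critically damped gain pairing
    M_inv = np.linalg.inv(M)
    # Task-space inertia matrix: Lambda = (J M^-1 J^T)^-1
    Lambda = np.linalg.inv(J @ M_inv @ J.T)
    # Desired task-space acceleration from a PD law
    acc_des = kp * np.asarray(pos_err) - kd * np.asarray(vel)
    # Scale by task-space inertia, then map the wrench to joint torques
    tau = J.T @ (Lambda @ acc_des)
    if gravity is not None:
        tau = tau + gravity
    return tau


# Same error and gains, two different mass-matrix estimates
J = np.eye(2)
err = np.array([1.0, 0.0])
vel = np.zeros(2)
tau_nominal = osc_torques(J, np.eye(2), err, vel)
tau_mismatch = osc_torques(J, 2.0 * np.eye(2), err, vel)
```

In this toy case, doubling the assumed mass matrix doubles the commanded torque for the same task-space error, which illustrates why an inaccurate model degrades an OSC controller and why inferring the dynamics parameters from data may help.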