Imitation learning offers a promising path for robots to learn general-purpose behaviors, but has traditionally exhibited limited scalability due to high data supervision requirements and brittle generalization. Inspired by recent advances in multi-task imitation learning, we investigate the use of prior data from previous tasks to facilitate learning novel tasks in a robust, data-efficient manner. To make effective use of the prior data, the robot must internalize knowledge from past experiences and contextualize this knowledge in novel tasks. To that end, we develop a skill-based imitation learning framework that extracts temporally extended sensorimotor skills from prior data and subsequently learns a policy for the target task that invokes these learned skills. We identify several key design choices that significantly improve performance on novel tasks, namely representation learning objectives that yield more predictable skill representations and a retrieval-based data augmentation mechanism that increases the scope of supervision for policy training. On a collection of simulated and real-world manipulation domains, we demonstrate that our method significantly outperforms existing imitation learning and offline reinforcement learning approaches. Videos and code are available at https://ut-austin-rpl.github.io/sailor