To be effective general-purpose machines in real-world environments, robots must not only adapt their existing manipulation skills to new circumstances but also acquire entirely new skills on the fly. A great promise of continual learning is to endow robots with this ability by reusing the knowledge and experience accumulated from prior skills. We take a fresh look at this problem by considering a setting in which the robot may store that knowledge and experience only in the form of learned skill policies. We show that storing skill policies, pre-training carefully, and appropriately choosing when to transfer those policies are together sufficient to build a continual learner for robotic manipulation. We analyze which conditions are needed to transfer skills in the challenging Meta-World simulation benchmark. Using this analysis, we introduce a pairwise metric relating skills that allows us to predict the effectiveness of skill transfer between tasks, and we use it to reduce the problem of continual learning to curriculum selection. Given an appropriate curriculum, we show how to continually acquire robotic manipulation skills without forgetting, using far fewer samples than are needed to train them from scratch.
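To illustrate how a pairwise transfer metric can turn continual learning into curriculum selection, the following is a minimal sketch and not the method evaluated in the paper. It assumes a hypothetical cost table giving the number of samples needed to learn a target skill when starting from a given source policy, plus a from-scratch cost for each skill. The function name `greedy_curriculum`, the cost values, and the greedy selection rule are all assumptions made for illustration; the greedy rule is a simple heuristic and is not guaranteed to find the cheapest curriculum.

```python
from typing import Dict, List, Optional, Tuple

Step = Tuple[Optional[str], str, float]  # (source skill or None, target skill, cost)


def greedy_curriculum(
    scratch_cost: Dict[str, float],
    transfer_cost: Dict[Tuple[str, str], float],
) -> List[Step]:
    """Order skills greedily by the cheapest next acquisition.

    A source of None means the target is trained from scratch.
    This is a heuristic sketch, not an optimal curriculum solver.
    """
    learned: List[str] = []
    remaining = set(scratch_cost)
    steps: List[Step] = []
    while remaining:
        best: Optional[Step] = None
        for target in sorted(remaining):  # sorted for deterministic tie-breaking
            # Candidate ways to acquire `target`: from scratch, or by
            # transferring from any already-learned skill with a known cost.
            candidates: List[Tuple[float, Optional[str]]] = [(scratch_cost[target], None)]
            for source in learned:
                if (source, target) in transfer_cost:
                    candidates.append((transfer_cost[(source, target)], source))
            cost, source = min(candidates, key=lambda c: c[0])
            if best is None or cost < best[2]:
                best = (source, target, cost)
        assert best is not None
        steps.append(best)
        learned.append(best[1])
        remaining.remove(best[1])
    return steps


# Hypothetical costs (arbitrary units of samples); not measured results.
scratch = {"reach": 1.0, "push": 3.0, "pick-place": 8.0}
transfer = {
    ("reach", "push"): 1.5,
    ("reach", "pick-place"): 5.0,
    ("push", "pick-place"): 2.0,
    ("push", "reach"): 0.5,
}

steps = greedy_curriculum(scratch, transfer)
total_with_transfer = sum(cost for _, _, cost in steps)
total_from_scratch = sum(scratch.values())
```

With these made-up numbers, the sketch learns `reach` from scratch, transfers it to `push`, and then transfers `push` to `pick-place`, for a total cost of 4.5 units against 12.0 when every skill is trained from scratch. Because each learned policy is stored rather than overwritten, earlier skills remain available as transfer sources at every step.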