The learning efficiency and generalization ability of an intelligent agent can be greatly improved by utilizing a useful set of skills. However, the design of robot skills can often be intractable in real-world applications due to the prohibitive amount of effort and expertise that it requires. In this work, we introduce Skill Learning In Diversified Environments (SLIDE), a method to discover generalizable skills via automated generation of a diverse set of tasks. As opposed to prior work on unsupervised discovery of skills which incentivizes the skills to produce different outcomes in the same environment, our method pairs each skill with a unique task produced by a trainable task generator. To encourage generalizable skills to emerge, our method trains each skill to specialize in the paired task and maximizes the diversity of the generated tasks. A task discriminator defined on the robot behaviors in the generated tasks is jointly trained to estimate the evidence lower bound of the diversity objective. The learned skills can then be composed in a hierarchical reinforcement learning algorithm to solve unseen target tasks. We demonstrate that the proposed method can effectively learn a variety of robot skills in two tabletop manipulation domains. Our results suggest that the learned skills can effectively improve the robot's performance in various unseen target tasks compared to existing reinforcement learning and skill learning methods.
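The abstract does not state the diversity objective explicitly. As a hedged sketch only, suppose diversity is measured by the mutual information between a task index $z$ (drawn from a fixed prior $p(z)$) and the robot behavior $\tau$ observed in the generated task; both symbols, and the choice of mutual information, are assumptions made here for illustration and not notation from the paper. Under that assumption, a task discriminator $q_\phi(z \mid \tau)$ would give the standard variational lower bound:

```latex
\begin{align}
I(z;\tau) &= H(z) - H(z \mid \tau) \\
          &= H(z) + \mathbb{E}_{z \sim p(z),\, \tau \sim p(\tau \mid z)}\big[\log p(z \mid \tau)\big] \\
          &\ge H(z) + \mathbb{E}_{z \sim p(z),\, \tau \sim p(\tau \mid z)}\big[\log q_\phi(z \mid \tau)\big]
\end{align}
```

The gap between the last two lines is $\mathbb{E}_{\tau}\big[D_{\mathrm{KL}}\big(p(z \mid \tau)\,\|\,q_\phi(z \mid \tau)\big)\big] \ge 0$, so the bound tightens as the discriminator improves. This is consistent with training the discriminator jointly with the skills and the task generator.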