Discovering Generalizable Skills via Automated Generation of Diverse Tasks

The learning efficiency and generalization ability of an intelligent agent can be greatly improved by utilizing a useful set of skills. However, the design of robot skills can often be intractable in real-world applications due to the prohibitive amount of effort and expertise that it requires. In this work, we introduce Skill Learning In Diversified Environments (SLIDE), a method to discover generalizable skills via automated generation of a diverse set of tasks. As opposed to prior work on unsupervised discovery of skills which incentivizes the skills to produce different outcomes in the same environment, our method pairs each skill with a unique task produced by a trainable task generator. To encourage generalizable skills to emerge, our method trains each skill to specialize in the paired task and maximizes the diversity of the generated tasks. A task discriminator defined on the robot behaviors in the generated tasks is jointly trained to estimate the evidence lower bound of the diversity objective. The learned skills can then be composed in a hierarchical reinforcement learning algorithm to solve unseen target tasks. We demonstrate that the proposed method can effectively learn a variety of robot skills in two tabletop manipulation domains. Our results suggest that the learned skills can effectively improve the robot's performance in various unseen target tasks compared to existing reinforcement learning and skill learning methods.
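The abstract does not state the diversity objective itself, so the following is only a minimal sketch of the general idea, assuming a standard variational lower bound on the mutual information between a skill (or task) index z and the observed behavior b: I(z; b) >= E[log q(z | b)] - E[log p(z)], where q is a learned discriminator. The function name, the uniform prior over indices, and the toy probability tables are all hypothetical and are not taken from the paper.

```python
import math


def diversity_lower_bound(disc_probs, skill_ids):
    """Variational lower bound on I(skill index; behavior), uniform prior assumed.

    disc_probs[i][k]: a hypothetical discriminator's probability that
    rollout i was produced under skill/task index k.
    skill_ids[i]: the index that was actually used for rollout i.
    """
    num_skills = len(disc_probs[0])
    log_prior = -math.log(num_skills)  # log p(z) under a uniform prior
    # Per-rollout term: log q(z | behavior) - log p(z)
    terms = [math.log(p[z]) - log_prior for p, z in zip(disc_probs, skill_ids)]
    return sum(terms) / len(terms)


# Toy data (illustrative only): three rollouts, one per skill index.
ids = [0, 1, 2]
confident = [[0.9, 0.05, 0.05], [0.05, 0.9, 0.05], [0.05, 0.05, 0.9]]
uninformed = [[1 / 3, 1 / 3, 1 / 3]] * 3

bound_confident = diversity_lower_bound(confident, ids)
bound_uninformed = diversity_lower_bound(uninformed, ids)
print(bound_confident, bound_uninformed)
```

In this toy setting the bound is higher when the discriminator can tell from the behavior which index produced it, which is the sense in which such a term could reward distinguishable tasks. How SLIDE actually defines and optimizes its objective should be taken from the paper, not from this sketch.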

