Few-shot Quality-Diversity Optimization

In the past few years, a considerable amount of research has been dedicated to exploiting previous learning experiences and designing Few-shot and Meta Learning approaches, in problem domains ranging from Computer Vision to Reinforcement Learning based control. A notable exception, where to the best of our knowledge little to no effort has been made in this direction, is Quality-Diversity (QD) optimization. QD methods have been shown to be effective tools for dealing with deceptive minima and sparse rewards in Reinforcement Learning. However, they remain costly due to their reliance on inherently sample-inefficient evolutionary processes. We show that, given examples from a task distribution, information about the paths taken by optimization in parameter space can be leveraged to build a prior population which, when used to initialize QD methods in unseen environments, allows for few-shot adaptation. Our proposed method does not require backpropagation. It is simple to implement and scale, and it is agnostic to the underlying models being trained. Experiments carried out in both sparse and dense reward settings using robotic manipulation and navigation benchmarks show that it considerably reduces the number of generations required for QD optimization in these environments.
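To make the idea of initializing a QD method from a prior population concrete, here is a minimal sketch. It is not the authors' method: it uses a toy MAP-Elites loop as a stand-in QD algorithm, and the names `map_elites`, `toy_fitness`, `toy_descriptor` and `prior_population` are all hypothetical. How the prior population would be built from optimization paths on training tasks is not shown; the sketch simply assumes such a population is available and passes it in as the starting archive.

```python
import random


def toy_fitness(x):
    # Hypothetical objective: best when the second gene equals 0.8.
    return -(x[1] - 0.8) ** 2


def toy_descriptor(x):
    # Hypothetical behavior descriptor in [0, 1]: the first gene.
    return x[0]


def map_elites(fitness, descriptor, init_population, generations,
               n_bins=10, sigma=0.1, seed=0):
    """Toy MAP-Elites: keep the fittest solution seen in each descriptor bin."""
    rng = random.Random(seed)
    archive = {}  # bin index -> (fitness, solution)

    def try_insert(x):
        # Map the descriptor to a bin, clamping 1.0 into the last bin.
        b = min(int(descriptor(x) * n_bins), n_bins - 1)
        f = fitness(x)
        if b not in archive or f > archive[b][0]:
            archive[b] = (f, x)

    # Initialization: this is where a prior population would be plugged in.
    for x in init_population:
        try_insert(x)

    # Each generation mutates one elite with Gaussian noise (no gradients).
    for _ in range(generations):
        _, parent = rng.choice(list(archive.values()))
        child = [min(1.0, max(0.0, g + rng.gauss(0.0, sigma))) for g in parent]
        try_insert(child)
    return archive


# Assumed stand-in for a prior population: one solution per descriptor bin.
prior_population = [[(i + 0.5) / 10, 0.5] for i in range(10)]
# Baseline for comparison: a single starting solution.
single_start = [[0.05, 0.5]]

from_prior = map_elites(toy_fitness, toy_descriptor, prior_population, generations=0)
from_single = map_elites(toy_fitness, toy_descriptor, single_start, generations=0)
print(len(from_prior), len(from_single))  # bins covered before any evolution
```

In this toy setting the population passed as `init_population` already covers every bin before a single generation is run, whereas the single-solution start has to reach the other bins through mutation. This only illustrates why a good initial population can reduce the number of generations needed; it says nothing about the gains reported in the experiments.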


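Related content

Few-shot learning

Few-shot learning (FSL) addresses how to improve the performance of neural networks when only a small amount of data is available. A commonly used family of methods in FSL is meta-learning. Like ordinary neural network training, meta-learning has a training phase and a testing phase, but these are called meta-training and meta-testing.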
