Quality-Diversity (QD) optimisation is a new family of learning algorithms that aim to generate collections of diverse and high-performing solutions. Among these algorithms, the recently introduced Covariance Matrix Adaptation MAP-Elites (CMA-ME) algorithm proposes the concept of emitters, each of which uses a predefined heuristic to drive the algorithm's exploration. CMA-ME was shown to outperform MAP-Elites, a popular QD algorithm that has demonstrated promising results in numerous applications. In this paper, we introduce Multi-Emitter MAP-Elites (ME-MAP-Elites), an algorithm that directly extends CMA-ME and improves its quality, diversity and data efficiency. It leverages a heterogeneous set of emitters, in which each emitter type improves the optimisation process in a different way. A bandit algorithm dynamically selects the most suitable emitters for the current situation. We evaluate the performance of ME-MAP-Elites on six tasks, ranging from standard optimisation problems (in 100 dimensions) to complex locomotion tasks in robotics. Our comparisons against CMA-ME and MAP-Elites show that ME-MAP-Elites is faster at providing collections of solutions that are significantly more diverse and higher performing. Moreover, in cases where no fruitful synergy can be found between the different emitters, ME-MAP-Elites performs equivalently to the best of the compared algorithms.
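To make the idea of a bandit choosing among emitter types concrete, the sketch below uses the standard UCB1 rule. It is only an illustration under stated assumptions: the abstract does not say which bandit algorithm or reward signal ME-MAP-Elites uses, so UCB1, the binary reward (standing in for "the emitter produced a solution that improved the collection"), the class name `UCB1EmitterSelector`, and the emitter labels `emitter_a`, `emitter_b` and `emitter_c` are all hypothetical.

```python
import math
import random


class UCB1EmitterSelector:
    """UCB1 bandit over emitter types (illustrative, not the paper's code)."""

    def __init__(self, emitter_names, exploration=math.sqrt(2)):
        self.names = list(emitter_names)
        self.exploration = exploration
        self.counts = {name: 0 for name in self.names}
        self.total_reward = {name: 0.0 for name in self.names}

    def select(self):
        # Try every emitter type once before applying the UCB1 rule.
        for name in self.names:
            if self.counts[name] == 0:
                return name
        total = sum(self.counts.values())

        def ucb(name):
            # Mean observed reward plus an exploration bonus that shrinks
            # as the emitter type is selected more often.
            mean = self.total_reward[name] / self.counts[name]
            bonus = self.exploration * math.sqrt(
                math.log(total) / self.counts[name]
            )
            return mean + bonus

        return max(self.names, key=ucb)

    def update(self, name, reward):
        # Record the reward obtained by the emitter type that was just used.
        self.counts[name] += 1
        self.total_reward[name] += reward


def run_demo(steps=500, seed=0):
    """Simulate emitter selection with made-up success probabilities."""
    rng = random.Random(seed)
    # Assumed probabilities that each emitter type improves the collection.
    success_prob = {"emitter_a": 0.2, "emitter_b": 0.6, "emitter_c": 0.4}
    selector = UCB1EmitterSelector(success_prob)
    for _ in range(steps):
        name = selector.select()
        reward = 1.0 if rng.random() < success_prob[name] else 0.0
        selector.update(name, reward)
    return selector.counts


if __name__ == "__main__":
    print(run_demo())
```

In a real QD loop, the simulated reward would be replaced by feedback from the collection itself, and the selector would be queried each time an emitter needs to be chosen or restarted. How often that happens and what counts as a reward are design choices that this sketch leaves open.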