Quality-Diversity (QD) algorithms have gained traction as optimisation methods due to their effectiveness at escaping local optima and their ability to generate wide-ranging, high-performing solutions. Multi-Objective MAP-Elites (MOME) recently extended the QD paradigm to the multi-objective setting by maintaining a Pareto front in each cell of a MAP-Elites grid. MOME achieved a global performance that competed with NSGA-II and SPEA2, two well-established Multi-Objective Evolutionary Algorithms (MOEAs), while also acquiring a diverse repertoire of solutions. However, MOME is limited by non-directed genetic search mechanisms, which struggle in high-dimensional search spaces. In this work, we present Multi-Objective MAP-Elites with Policy-Gradient Assistance and Crowding-based Exploration (MOME-PGX): a new QD algorithm that extends MOME to improve its data efficiency and performance. MOME-PGX uses gradient-based optimisation to efficiently drive solutions towards higher performance. It also introduces crowding-based mechanisms to create an improved exploration strategy and to encourage uniformity across Pareto fronts. We evaluate MOME-PGX on four simulated robot locomotion tasks and demonstrate that it converges faster and reaches higher performance than all baselines. We show that MOME-PGX is between 4.3 and 42 times more data-efficient than MOME and doubles the performance of MOME, NSGA-II and SPEA2 in challenging environments.
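To make the idea of "a Pareto front in each cell of a MAP-Elites grid" concrete, the following is a minimal sketch and not the authors' implementation. It assumes all objectives are maximised, descriptors lie in [0, 1] per dimension, and the grid is uniform; the names `dominates`, `cell_index`, `add_to_archive` and the parameter `bins_per_dim` are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

Fitness = Tuple[float, ...]
Cell = Tuple[int, ...]
Archive = Dict[Cell, List[Tuple[Fitness, str]]]


def dominates(a: Fitness, b: Fitness) -> bool:
    """Pareto dominance under maximisation (assumption):
    a is no worse than b on every objective and strictly better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def cell_index(descriptor: Tuple[float, ...], bins_per_dim: int) -> Cell:
    """Map a descriptor in [0, 1]^d to a cell of a uniform grid (assumption)."""
    return tuple(min(int(d * bins_per_dim), bins_per_dim - 1) for d in descriptor)


def add_to_archive(
    archive: Archive,
    solution: str,
    fitness: Fitness,
    descriptor: Tuple[float, ...],
    bins_per_dim: int = 4,
) -> bool:
    """Try to insert a solution into the Pareto front of its cell.

    Returns True if the solution was added, False if it was rejected.
    """
    cell = cell_index(descriptor, bins_per_dim)
    front = archive.setdefault(cell, [])
    # Reject candidates that are dominated by, or identical in fitness to, a member.
    if any(dominates(f, fitness) or f == fitness for f, _ in front):
        return False
    # Remove members that the candidate dominates, then add the candidate.
    front[:] = [(f, s) for f, s in front if not dominates(fitness, f)]
    front.append((fitness, solution))
    return True


archive: Archive = {}
add_to_archive(archive, "a", (1.0, 1.0), (0.1, 0.1))  # first entry in cell (0, 0)
add_to_archive(archive, "b", (0.5, 0.5), (0.1, 0.1))  # rejected: dominated by "a"
add_to_archive(archive, "c", (2.0, 0.5), (0.1, 0.1))  # kept: trades off against "a"
```

In this sketch each cell keeps only mutually non-dominated solutions, so the archive is diverse across cells (different descriptors) and within cells (different objective trade-offs). The sketch leaves front sizes unbounded and omits the policy-gradient and crowding-based mechanisms.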