A fascinating aspect of nature lies in its ability to produce a collection of organisms that are all high-performing in their niche. Quality-Diversity (QD) methods are evolutionary algorithms inspired by this observation, which have obtained great results in many applications, from wing design to robot adaptation. Recently, several works demonstrated that these methods could be applied to perform neuro-evolution to solve control problems in large search spaces. In such problems, diversity can be a target in itself. Diversity can also be a way to enhance exploration in tasks exhibiting deceptive reward signals. While the first aspect has been studied in depth by the QD community, the latter has received less attention in the literature. Exploration is at the heart of several domains trying to solve control problems, such as Reinforcement Learning, and QD methods are promising candidates to overcome the associated challenges. Therefore, we believe that standardized benchmarks exhibiting high-dimensional control problems with exploration difficulties are of interest to the QD community. In this paper, we highlight three candidate benchmarks and explain why they appear relevant for the systematic evaluation of QD algorithms. We also provide open-source implementations in Jax, allowing practitioners to run fast and numerous experiments on limited compute resources.
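To illustrate why Jax-based implementations enable fast and numerous experiments, the sketch below shows how a whole population of policies can be evaluated in a single vectorized, JIT-compiled call with `jax.vmap`. This is only a minimal illustration, not the paper's actual API: the toy 2D point environment, the linear policy shape, the episode length, and the population size are all assumptions introduced here for demonstration.

```python
# Minimal sketch (assumptions throughout): batched policy evaluation with JAX,
# the kind of vectorization that makes QD experiments cheap to run.
import jax
import jax.numpy as jnp

EPISODE_LENGTH = 50  # assumed horizon for the toy rollout


def rollout(policy_params, key):
    """Roll out one linear policy in a toy 2D point environment.

    Returns the episode return (fitness) and the final position, which a QD
    algorithm could use as a behaviour descriptor.
    """
    def step(carry, _):
        pos, ret = carry
        action = jnp.tanh(policy_params @ pos)   # linear policy, 2 -> 2
        new_pos = pos + 0.05 * action
        reward = -jnp.linalg.norm(new_pos)       # simple reward signal
        return (new_pos, ret + reward), None

    init_pos = jax.random.uniform(key, (2,), minval=-1.0, maxval=1.0)
    (final_pos, total_return), _ = jax.lax.scan(
        step, (init_pos, 0.0), None, length=EPISODE_LENGTH
    )
    return total_return, final_pos


# Vectorize over the population and JIT-compile the whole batch.
batched_rollout = jax.jit(jax.vmap(rollout))

key = jax.random.PRNGKey(0)
population = jax.random.normal(key, (1024, 2, 2))  # 1024 linear policies
keys = jax.random.split(key, 1024)
fitnesses, descriptors = batched_rollout(population, keys)
print(fitnesses.shape, descriptors.shape)          # (1024,), (1024, 2)
```

Because the rollout is expressed as pure JAX functions, the same pattern scales to neural-network policies and hardware accelerators, which is the kind of throughput the benchmarks target.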