One of the main problems of evolutionary algorithms is the convergence of the population to local minima. In this paper, we explore techniques that avoid this problem by encouraging diverse agent behavior through a shared reward system. Rewards are randomly distributed in the environment, and an agent is rewarded only if it is the first to collect a given reward. This leads to the emergence of novel behaviors among the agents. We apply our approach to the maze problem and compare it with a previously proposed solution, Novelty Search (Lehman and Stanley, 2011a). We find that our solution improves performance while being significantly simpler. Building on that, we generalize the problem and apply our approach to a more advanced set of tasks, Atari games, where we observe comparable performance at a much lower computational cost.
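To make the shared reward idea concrete, the following is a minimal sketch and not the implementation used in the paper. It assumes a grid environment, agents described by fixed-length trajectories of grid cells, and ties broken in favor of the lower agent index; the function names `scatter_rewards` and `shared_reward_fitness` are hypothetical and introduced only for illustration.

```python
import random


def scatter_rewards(width, height, n_rewards, rng):
    """Place n_rewards at distinct random cells of a width x height grid (assumed layout)."""
    cells = [(x, y) for x in range(width) for y in range(height)]
    return set(rng.sample(cells, n_rewards))


def shared_reward_fitness(trajectories, rewards):
    """Score agents when each reward can be collected only once.

    trajectories: one list of (x, y) cells per agent, all of equal length.
    rewards: set of (x, y) cells that hold a reward.
    Returns one integer score per agent.
    """
    remaining = set(rewards)
    scores = [0] * len(trajectories)
    n_steps = len(trajectories[0]) if trajectories else 0
    for t in range(n_steps):
        # All agents advance one step; the first agent to reach a reward takes it.
        # Ties within a step go to the lower index, which is an arbitrary assumption.
        for i, path in enumerate(trajectories):
            cell = path[t]
            if cell in remaining:
                remaining.remove(cell)
                scores[i] += 1
    return scores


if __name__ == "__main__":
    rng = random.Random(0)
    rewards = scatter_rewards(4, 4, 5, rng)
    # Two agents following the same path: only the first one can be rewarded.
    path = [(x, 0) for x in range(4)]
    print(shared_reward_fitness([path, list(path)], rewards))
```

Under these assumptions, an agent that copies another agent's path earns nothing from the rewards that agent has already taken, so selection pressure favors agents that explore different parts of the environment.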