In this work, we tackle the problem of open-ended learning by introducing a method that simultaneously evolves agents and increasingly challenging environments. Unlike previous open-ended approaches that optimize agents using a fixed neural network topology, we hypothesize that generalization can be improved by allowing agents' controllers to become more complex as they encounter more difficult environments. Our method, Augmentative Topology EPOET (ATEP), extends the Enhanced Paired Open-Ended Trailblazer (EPOET) algorithm by allowing agents to evolve their own neural network structures over time, adding complexity and capacity as necessary. Empirical results demonstrate that ATEP produces general agents capable of solving more environments than a fixed-topology baseline. We also investigate mechanisms for transferring agents between environments and find that a species-based approach further improves the performance and generalization of agents.