Adapting a Reinforcement Learning (RL) agent to an unseen environment is difficult because agents typically overfit to their training environment. RL agents can often solve environments very close to the one they were trained on, but their performance drops quickly as environments become substantially different. Retraining agents on new environments raises a second issue: the risk of catastrophic forgetting, where performance on previously seen environments is seriously hampered. This paper proposes a novel approach that exploits an eco-system of agents to address both concerns. The (limited) adaptive power of individual agents is harvested to build a highly adaptive eco-system.