'Reincarnation' in reinforcement learning has been proposed as a formalisation of reusing prior computation from past experiments when training an agent in an environment. In this paper, we present a brief foray into the paradigm of reincarnation in the multi-agent (MA) context. We consider the case where only some agents are reincarnated, whereas the others are trained from scratch -- selective reincarnation. In the fully-cooperative MA setting with heterogeneous agents, we demonstrate that selective reincarnation can lead to higher returns than training fully from scratch, and faster convergence than training with full reincarnation. However, the choice of which agents to reincarnate in a heterogeneous system is vitally important to the outcome of the training -- in fact, a poor choice can lead to considerably worse results than the alternatives. We argue that a rich field of work exists here, and we hope that our effort catalyses further energy in bringing the topic of reincarnation to the multi-agent realm.
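As an illustration only, and not the implementation used in our experiments, the sketch below shows one way selective reincarnation might be wired up: policies for a chosen subset of agents are initialised from previously trained "teacher" policies, while the remaining agents start tabula rasa. The names `make_policy`, `teacher_policies`, and `reincarnate_ids` are hypothetical placeholders.

```python
import copy

def build_policies(agent_ids, reincarnate_ids, teacher_policies, make_policy):
    """Selectively reincarnate a heterogeneous multi-agent system.

    agent_ids        -- all agents in the environment
    reincarnate_ids  -- subset of agents that reuse prior computation
    teacher_policies -- previously trained policies, keyed by agent id
    make_policy      -- factory returning a freshly initialised policy
    """
    policies = {}
    for agent_id in agent_ids:
        if agent_id in reincarnate_ids:
            # Reincarnated agent: start from the teacher's parameters
            # (reuse of replay data or value functions would be analogous).
            policies[agent_id] = copy.deepcopy(teacher_policies[agent_id])
        else:
            # Tabula-rasa agent: fresh random initialisation.
            policies[agent_id] = make_policy(agent_id)
    return policies
```

Under this framing, full reincarnation and training fully from scratch are simply the two extremes where `reincarnate_ids` contains every agent or none; our results suggest the interesting behaviour lies in between, and depends strongly on which subset is chosen.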