In recent years, deep reinforcement learning has achieved strong results on complex single-agent tasks, and more recently the approach has been extended to multi-agent domains. In this paper, we propose MAGNet, a novel approach to multi-agent reinforcement learning that utilizes a relevance-graph representation of the environment, obtained by a self-attention mechanism, together with a message-generation technique. We applied MAGNet to a synthetic predator-prey multi-agent environment and to the Pommerman game; the results show that it significantly outperforms state-of-the-art MARL solutions, including Multi-agent Deep Q-Networks (MADQN), Multi-agent Deep Deterministic Policy Gradient (MADDPG), and QMIX.
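To give intuition for the relevance-graph idea, the following is a minimal sketch (not the authors' implementation) of how pairwise relevance weights between agents and environment objects can be derived with scaled dot-product self-attention; entity counts, feature dimensions, and variable names are illustrative assumptions.

```python
import numpy as np

# Illustrative sketch: entity features -> attention weights -> relevance graph.
rng = np.random.default_rng(0)
n_entities, d = 4, 8                        # assumed: agents + objects, feature size
X = rng.standard_normal((n_entities, d))    # one feature vector per entity

# Learned projections in a real model; random here for the sketch.
Wq = rng.standard_normal((d, d))
Wk = rng.standard_normal((d, d))

Q, K = X @ Wq, X @ Wk
scores = Q @ K.T / np.sqrt(d)               # pairwise relevance logits

# Row-wise softmax: row i holds the weights of edges from entity i,
# i.e. the adjacency matrix of a (dense) relevance graph.
exp_scores = np.exp(scores - scores.max(axis=1, keepdims=True))
relevance_graph = exp_scores / exp_scores.sum(axis=1, keepdims=True)
```

In a full MARL pipeline, such an adjacency matrix would then drive message passing between connected entities; the sketch stops at the graph itself.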