Multi-agent reinforcement learning (MARL) has made great progress on cooperative tasks in recent years. However, under the local-reward scheme, where each agent receives only its own local reward and no global reward is shared among all agents, traditional MARL algorithms give insufficient consideration to the agents' mutual influence. In cooperative tasks this mutual influence is especially important, since agents must coordinate to achieve better performance. In this paper, we propose a novel algorithm, Mutual-Help-based MARL (MH-MARL), which instructs agents to help each other and thereby promotes cooperation. MH-MARL utilizes an expected action module to generate, for each agent, the actions it expects the other agents to take. These expected actions are then delivered to the other agents for selective imitation during training. Experimental results show that MH-MARL improves performance in both success rate and cumulative reward.
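The selective-imitation step described above can be sketched roughly as follows. This is a minimal illustrative toy, not the paper's actual formulation: the function name, the per-action advantage signal, and the positive-advantage selection rule are all assumptions. Each agent receives from a teammate the action that teammate expects it to take, plus an advantage estimate for that expectation, and adds an imitation (cross-entropy) term only where the advantage is positive, which makes the imitation selective.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def selective_imitation_loss(own_logits, expected_actions, advantages):
    """Toy sketch of selective imitation (names and selection rule are
    assumptions, not the paper's API).

    own_logits:       per-timestep action logits of this agent
    expected_actions: actions a teammate expects this agent to take
    advantages:       advantage estimates attached to those expectations

    Imitates (cross-entropy toward the expected action) only where the
    advantage is positive; other expectations are ignored.
    """
    total, count = 0.0, 0
    for logits, a_exp, adv in zip(own_logits, expected_actions, advantages):
        if adv > 0:  # the 'selective' part: imitate only beneficial expectations
            p = softmax(logits)[a_exp]
            total += -math.log(p + 1e-8)
            count += 1
    return total / count if count else 0.0
```

In a full training loop, this term would be added to the agent's usual policy-gradient loss so that agents follow teammates' expectations only when doing so is estimated to help.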