Recent studies have shown that introducing communication between agents can significantly improve overall performance in cooperative multi-agent reinforcement learning (MARL). In many real-world scenarios, however, communication is expensive and the bandwidth of the multi-agent system is constrained. Redundant messages that occupy communication resources can block the transmission of informative messages and thus degrade performance. In this paper, we aim to learn minimal sufficient communication messages. First, we initialize the communication between agents as a complete graph. We then introduce the graph information bottleneck (GIB) principle into this complete graph and derive the corresponding optimization over graph structures. Based on this optimization, we propose a novel multi-agent communication module, called CommGIB, which effectively compresses both the structure information and the node information in the communication graph to handle bandwidth-constrained settings. Extensive experiments are conducted in Traffic Control and StarCraft II. The results indicate that the proposed method achieves better performance in bandwidth-restricted settings than state-of-the-art algorithms, with especially large margins in large-scale multi-agent tasks.
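As a rough illustration of the principle the abstract refers to, a generic graph information bottleneck objective can be written as follows. This is the general form commonly used in the GIB literature and is not necessarily the exact objective derived in this paper. All symbols are assumptions introduced here for illustration: $A$ is the structure of the communication graph (initially complete), $X$ the node (agent) features, $Z$ the compressed message representation, $Y$ the task-relevant target, and $\beta > 0$ a trade-off weight.

```latex
% Generic GIB objective (illustrative; notation is assumed, not taken from the paper).
% The first term rewards representations Z that are informative about the target Y.
% The second term penalizes information that Z retains about the graph (A, X).
\[
  \min_{\mathbb{P}(Z \mid A, X)} \; -\, I(Y; Z) \;+\; \beta \, I(A, X; Z)
\]
```

Under this reading, the first term keeps messages sufficient for the task, while the second discourages carrying redundant structure and node information, which is how compression relates to a limited communication bandwidth.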