We propose a model that enables multiple decentralized agents to share their perception of the environment in a fair and adaptive way. The model takes both the current message and historical observations into account, handling them in the same recurrent model but in different forms. Specifically, we present a dual-level recurrent communication framework for multi-agent systems: the first recurrence runs over the communication sequence and transmits communication data among agents, while the second runs over the time sequence and combines the historical observations of each agent. The resulting communication flow separates communication messages from memories, yet the dual-level recurrence still allows agents to share their historical observations. This design lets agents adapt to a changing set of communication partners while keeping the communication results fair to all of them. We discuss our method in both partially observable and fully observable environments. Results from several experiments suggest that our method outperforms existing decentralized communication frameworks and the corresponding centralized training method.
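The dual-level recurrence described above can be illustrated with a minimal sketch. This is not the authors' implementation: the plain tanh cell, the ring visiting order, the zero-initialised message state, and every name (`rnn_cell`, `dual_level_step`, `make_matrix`, the weight tuple `params`) are assumptions made for illustration only. The sketch keeps a per-agent memory that is updated over time and a separate message state that is updated over the sequence of other agents.

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Small random weight matrix (illustrative initialisation only)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def rnn_cell(w_in, w_rec, x, h):
    """Plain tanh recurrent cell: h' = tanh(w_in @ x + w_rec @ h)."""
    return [
        math.tanh(
            sum(wi * xi for wi, xi in zip(w_in[k], x))
            + sum(wr * hi for wr, hi in zip(w_rec[k], h))
        )
        for k in range(len(h))
    ]


def dual_level_step(obs, memories, params):
    """Run one environment step for all agents (hypothetical layout).

    Time-level recurrence: each agent folds its new observation into its
    own memory. Communication-level recurrence: for each agent, a separate
    message state is run over the other agents' memories. Messages are
    returned separately and never written back into the memories.
    """
    w_obs, w_mem, w_msg_in, w_msg_rec = params
    n = len(obs)
    hidden = len(memories[0])

    # Recurrence over time, one update per agent.
    new_memories = [rnn_cell(w_obs, w_mem, obs[i], memories[i]) for i in range(n)]

    # Recurrence over the communication sequence.
    messages = []
    for i in range(n):
        msg = [0.0] * hidden
        # Assumed ring order starting after agent i, so every agent's
        # message passes over the same number of peers.
        for offset in range(1, n):
            j = (i + offset) % n
            msg = rnn_cell(w_msg_in, w_msg_rec, new_memories[j], msg)
        messages.append(msg)
    return new_memories, messages


def run_demo(n_agents=3, obs_dim=4, hidden=5, steps=3, seed=0):
    """Roll the sketch forward for a few steps with random observations."""
    rng = random.Random(seed)
    params = (
        make_matrix(hidden, obs_dim, rng),  # observation -> memory
        make_matrix(hidden, hidden, rng),   # memory -> memory (time)
        make_matrix(hidden, hidden, rng),   # peer memory -> message
        make_matrix(hidden, hidden, rng),   # message -> message (communication)
    )
    memories = [[0.0] * hidden for _ in range(n_agents)]
    messages = [[0.0] * hidden for _ in range(n_agents)]
    for _ in range(steps):
        obs = [[rng.uniform(-1.0, 1.0) for _ in range(obs_dim)] for _ in range(n_agents)]
        memories, messages = dual_level_step(obs, memories, params)
    return memories, messages
```

Because the message state is built by a recurrence over peers, the same weights apply to any number of agents, which is one plausible reading of how such a design could cope with a changing set of communication partners. Calling `run_demo(n_agents=2)` and `run_demo(n_agents=5)` uses identical weight shapes.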