Learning communication via deep reinforcement learning (RL) or imitation learning (IL) has recently been shown to be an effective way to solve Multi-Agent Path Finding (MAPF). However, existing communication-based MAPF solvers focus on broadcast communication, where an agent broadcasts its message to all other agents or to a predefined set of agents. Broadcasting is not only impractical but also produces redundant information that can even impair multi-agent cooperation. A succinct communication scheme should instead learn which information is relevant and influential to each agent's decision-making process. To address this problem, we consider a request-reply scenario and propose Decision Causal Communication (DCC), a simple yet efficient model that enables agents to select which neighbors to communicate with during both training and execution. Specifically, a neighbor is deemed relevant and influential only when its presence causes a decision adjustment in the central agent. This judgment is learned solely from each agent's local observation and is therefore suitable for decentralized execution on large-scale problems. Empirical evaluation in obstacle-rich environments demonstrates that our method achieves a high success rate with low communication overhead.
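To make the decision-causal criterion concrete, the following minimal Python sketch (our illustration, not the paper's implementation) tests each neighbor by comparing the central agent's greedy action with and without that neighbor present in the local observation; `q_net`, `obs_without_neighbors`, and `obs_with_neighbor` are hypothetical names for a local Q-network and the corresponding observation encodings.

```python
import torch

def select_relevant_neighbors(q_net, obs_without_neighbors, obs_with_neighbor):
    """Decision-causal neighbor selection (illustrative sketch).

    q_net                 -- hypothetical local Q-network: observation -> action values
    obs_without_neighbors -- local observation with all neighbors masked out
    obs_with_neighbor     -- dict: neighbor_id -> observation including only that neighbor

    A neighbor is considered relevant and influential iff adding it to the
    local observation changes the central agent's greedy action, i.e. its
    presence causes a decision adjustment. Only such neighbors are selected
    for the request-reply communication round.
    """
    with torch.no_grad():
        base_action = q_net(obs_without_neighbors).argmax(dim=-1)
        relevant = []
        for neighbor_id, obs in obs_with_neighbor.items():
            action = q_net(obs).argmax(dim=-1)
            if not torch.equal(action, base_action):  # decision adjusted by this neighbor
                relevant.append(neighbor_id)
    return relevant
```

Because the test depends only on the central agent's local observation, it can run independently on every agent at execution time, which is what makes the scheme compatible with decentralized execution on large-scale problems.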