Optimal communication and control strategies in a multi-agent MDP problem

The problem of controlling multi-agent systems under different models of information sharing among agents has received significant attention in the recent literature. In this paper, we consider a setup in which, rather than committing to a fixed information-sharing protocol (e.g., periodic sharing or no sharing), agents can dynamically decide at each time step whether to share information with each other and incur the resulting communication cost. This setup requires a joint design of the agents' communication and control strategies in order to optimize the trade-off between communication costs and the control objective. We first show that agents can ignore a large part of their private information without compromising system performance. We then provide a solution to the strategy optimization problem based on the common information approach. This approach relies on constructing a fictitious POMDP whose solution (obtained via a dynamic program) characterizes the optimal strategies for the agents. We also show that our solution can be easily modified to incorporate constraints on when and how frequently agents can communicate.
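As a rough illustration of the communication/control trade-off resolved by a dynamic program, the sketch below runs backward induction on a deliberately simplified toy problem. It is not the fictitious POMDP of the paper: the state is collapsed to a hypothetical "information age" (steps since the agents last shared information), and the names `solve_toy`, `comm_cost`, `drift_cost`, and `max_age`, as well as the linear control penalty, are assumptions made only for this example.

```python
def solve_toy(horizon, comm_cost, drift_cost, max_age):
    """Backward induction on a toy communicate-or-stay-silent problem.

    Assumed model (illustrative only): the state k is the number of steps
    since information was last shared. Staying silent costs drift_cost * k
    at this step; communicating costs comm_cost and resets the age to 0.
    """
    # value[t][k]: minimal cost-to-go at time t with information age k.
    value = [[0.0] * (max_age + 1) for _ in range(horizon + 1)]
    # policy[t][k]: True if communicating is strictly better at (t, k).
    policy = [[False] * (max_age + 1) for _ in range(horizon)]

    for t in range(horizon - 1, -1, -1):
        for k in range(max_age + 1):
            # Stay silent: pay a control penalty growing with age; age increases.
            silent = drift_cost * k + value[t + 1][min(k + 1, max_age)]
            # Communicate: pay the communication cost, act with fresh
            # information (zero penalty); the age at the next step is 1.
            share = comm_cost + value[t + 1][min(1, max_age)]
            policy[t][k] = share < silent
            value[t][k] = min(silent, share)
    return value, policy


if __name__ == "__main__":
    value, policy = solve_toy(horizon=3, comm_cost=1.5, drift_cost=1.0, max_age=3)
    print("cost-to-go from fresh information:", value[0][0])
    print("communicate at the last step, by age:", policy[2])
```

In this toy model the resulting policy is a threshold on the information age: agents stay silent while the accumulated penalty is below the communication cost and share once it exceeds it. A constraint on when communication is allowed could be imitated by forcing the `silent` branch at the restricted time steps, though how the paper itself handles such constraints is not specified here.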
