Decentralized team problems in which players have asymmetric information about the state of the underlying stochastic system have been actively studied, but games between such teams are less well understood. We consider a general model of zero-sum stochastic games between two competing teams. This model subsumes many previously considered team and zero-sum game models. For this general model, we provide bounds on the upper (min-max) and lower (max-min) values of the game. Furthermore, if the upper and lower values are identical (i.e., if the game has a value), our bounds coincide with the value of the game. Our bounds are obtained using two dynamic programs based on a sufficient statistic known as the common information belief (CIB). We also identify certain information structures in which only the minimizing team controls the evolution of the CIB. In these cases, we show that one of our CIB-based dynamic programs can be used to find the min-max strategy (in addition to the min-max value). We propose an approximate dynamic programming approach for computing the values (and the strategy, when applicable) and illustrate our results with an example.
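
For reference, the upper and lower values mentioned above can be written out in generic notation. This is only a sketch under assumed notation that does not come from the source: $g^1 \in \mathcal{G}^1$ and $g^2 \in \mathcal{G}^2$ stand for the strategy profiles of the minimizing and maximizing teams, $J(g^1, g^2)$ stands for the expected total cost, and the minima and maxima are assumed to be attained.

```latex
% Assumed notation (not from the source): g^1 is the minimizing team's
% strategy profile, g^2 is the maximizing team's, J is the expected cost.
\begin{align*}
S^u &= \min_{g^1 \in \mathcal{G}^1} \, \max_{g^2 \in \mathcal{G}^2} J(g^1, g^2)
  && \text{(upper, min-max value)} \\
S^l &= \max_{g^2 \in \mathcal{G}^2} \, \min_{g^1 \in \mathcal{G}^1} J(g^1, g^2)
  && \text{(lower, max-min value)} \\
S^l &\le S^u
  && \text{(always holds)}
\end{align*}
```

In this notation, the game has a value exactly when $S^l = S^u$, which is the case in which the abstract states that the bounds coincide with the value.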