Decentralized team problems in which players have asymmetric information about the state of the underlying stochastic system have been actively studied, but \emph{games} between such teams are less well understood. We consider a general model of zero-sum stochastic games between two competing teams. This model subsumes many previously considered team and zero-sum game models. For this general model, we provide bounds on the upper (min-max) and lower (max-min) values of the game. Furthermore, if the upper and lower values of the game are identical (i.e., if the game has a \emph{value}), our bounds coincide with the value of the game. Our bounds are obtained using two dynamic programs based on a sufficient statistic known as the common information belief (CIB). We also identify certain information structures in which only the minimizing team controls the evolution of the CIB. In these cases, we show that one of our CIB-based dynamic programs can be used to find the min-max strategy (in addition to the min-max value). We propose an approximate dynamic programming approach for computing the values (and the strategy when applicable) and illustrate our results with an example.
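
For reference, the upper and lower values mentioned above can be written out as follows. This is only a sketch in assumed notation that does not appear in the abstract itself: $g^1 \in \mathcal{G}^1$ denotes a strategy of the minimizing team, $g^2 \in \mathcal{G}^2$ a strategy of the maximizing team, and $J(g^1, g^2)$ the expected total cost under that strategy pair.
\[
S^u = \min_{g^1 \in \mathcal{G}^1} \max_{g^2 \in \mathcal{G}^2} J(g^1, g^2),
\qquad
S^l = \max_{g^2 \in \mathcal{G}^2} \min_{g^1 \in \mathcal{G}^1} J(g^1, g^2),
\qquad
S^l \le S^u .
\]
The inequality always holds (weak duality, assuming the optima are attained); the game has a \emph{value} exactly when $S^l = S^u$.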