Computational game theory has many applications in the modern world in both adversarial situations and the optimization of social good. While there exist many algorithms for computing solutions in two-player interactions, finding optimal strategies in multiplayer interactions efficiently remains an open challenge. This paper focuses on computing the multiplayer Team-Maxmin Equilibrium with Coordination device (TMECor) in zero-sum extensive-form games. TMECor models scenarios when a team of players coordinates ex ante against an adversary. Such situations can be found in card games (e.g., in Bridge and Poker), when a team works together to beat a target player but communication is prohibited; and also in real world, e.g., in forest-protection operations, when coordinated groups have limited contact during interdicting illegal loggers. The existing algorithms struggle to find a TMECor efficiently because of their high computational costs. To compute a TMECor in larger games, we make the following key contributions: (1) we propose a hybrid-form strategy representation for the team, which preserves the set of equilibria; (2) we introduce a column-generation algorithm with a guaranteed finite-time convergence in the infinite strategy space based on a novel best-response oracle; (3) we develop an associated-representation technique for the exact representation of the multilinear terms in the best-response oracle; and (4) we experimentally show that our algorithm is several orders of magnitude faster than prior state-of-the-art algorithms in large games.
翻译:计算游戏理论在现代世界中有许多应用, 包括对抗性游戏和优化社会公益。 虽然在两个玩家互动中有许多计算解决方案的算法, 但找到多玩者互动的最佳战略仍然是一项公开的挑战。 本文侧重于在零和广泛形式的游戏中计算多玩者团队- Maxmin Equilium 与协调设备( TMECor) 之间的平衡。 当一个玩家团队对一个对手进行前期协调时, TMECor 模型的情景在较快的游戏中具有许多应用性。 在纸牌游戏中( 例如, 在Bridge和Poker), 当一个团队共同努力击败一个目标玩家, 但通信却被禁止; 在现实世界中, 比如在森林保护行动中, 当协调的团体在拦截非法日志时接触有限。 现有的算法由于它们计算成本高, 很难找到一个TMECor 。 为了在更大的游戏中计算出一个TMECor, 我们做出以下关键贡献:(1) 我们提议了一个混合的策略代表状态, 以保持一个松动式的组合; (2) 我们提出一个以最接近的算式的货币代表, 以我们最有保证的策略 以最有保证的 的极的 的 的 的 直压 。