We study the application of iterative first-order methods to the problem of computing equilibria of large-scale two-player extensive-form games. First-order methods must typically be instantiated with a regularizer that serves as a distance-generating function for the decision sets of the players. For the case of two-player zero-sum games, the state-of-the-art theoretical convergence rate for Nash equilibrium is achieved by using the dilated entropy function. In this paper, we introduce a new entropy-based distance-generating function for two-player zero-sum games, and show that this function achieves significantly better strong convexity properties than the dilated entropy, while maintaining the same easily implemented closed-form proximal mapping. Extensive numerical simulations show that these superior theoretical properties translate into better numerical performance as well. We then generalize our new entropy distance function, as well as general dilated distance functions, to the scaled extension operator. The scaled extension operator is a way to recursively construct convex sets, which generalizes the decision polytope of extensive-form games, as well as the convex polytopes corresponding to correlated and team equilibria. By instantiating first-order methods with our regularizers, we develop the first accelerated first-order methods for computing correlated equilibria and ex-ante coordinated team equilibria. Our methods have a guaranteed $1/T$ rate of convergence, along with linear-time proximal updates.
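To give a concrete sense of the "closed-form proximal mapping" that entropy-based regularizers admit, the sketch below shows the standard negative-entropy proximal step on a probability simplex, which reduces to a multiplicative-weights (softmax-style) update. This is a minimal illustration of the well-known single-simplex case, not the paper's dilated or scaled-extension construction, which applies such updates recursively over the decision structure; the function name `entropy_prox` and the step size `eta` are illustrative choices.

```python
import numpy as np

def entropy_prox(g, y, eta=1.0):
    """Proximal step for the negative-entropy regularizer on the simplex.

    Solves  argmin_{x in simplex} <g, x> + (1/eta) * KL(x || y),
    whose closed-form solution is  x_i proportional to y_i * exp(-eta * g_i).
    This is the single-simplex building block; dilated constructions
    apply it recursively at each decision point of the game tree.
    """
    # Shift the exponent for numerical stability (does not change the argmin).
    z = y * np.exp(-eta * (g - g.min()))
    return z / z.sum()

# Example: starting from the uniform distribution, mass shifts toward
# the coordinate with the smallest (best) gradient entry.
y = np.ones(3) / 3.0
g = np.array([1.0, 0.0, 2.0])
x = entropy_prox(g, y)
```

Each proximal step costs time linear in the simplex size, which is what makes dilated regularizers attractive for large game trees: the overall update remains linear in the size of the decision polytope's description.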