We consider the problem of computing mixed Nash equilibria of two-player zero-sum games with continuous sets of pure strategies and with first-order access to the payoff function. This problem arises, for example, in game-theory-inspired machine learning applications such as distributionally-robust learning. In those applications, the strategy sets are high-dimensional, so methods based on discretization cannot tractably return high-accuracy solutions. In this paper, we introduce and analyze a particle-based method with guaranteed local convergence for this problem. The method parametrizes the mixed strategies as atomic measures and applies proximal point updates to both the atoms' weights and positions. It can be interpreted as a time-implicit discretization of the "interacting" Wasserstein-Fisher-Rao gradient flow. We prove that, under non-degeneracy assumptions, this method converges at an exponential rate to the exact mixed Nash equilibrium from any initialization satisfying a natural notion of closeness to optimality. We illustrate our results with numerical experiments and discuss applications to max-margin and distributionally-robust classification using two-layer neural networks, where our method has a natural interpretation as simultaneous training of the network's weights and of the adversarial distribution.
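To make the particle parametrization concrete, the following is a minimal sketch in Python/NumPy. It uses explicit (forward-Euler) Wasserstein-Fisher-Rao updates rather than the proximal-point (implicit) updates analyzed in the paper, and the payoff function `f`, the step sizes `eta_w` and `eta_x`, and the particle counts are illustrative assumptions, not quantities specified in the abstract.

```python
import numpy as np

# Sketch of the particle parametrization described in the abstract: each
# player's mixed strategy is an atomic measure sum_i a_i * delta_{x_i}, and
# both the weights (Fisher-Rao part) and the positions (Wasserstein part)
# are updated. Explicit forward-Euler steps are used here for simplicity;
# the paper's method uses proximal-point (implicit) updates instead.
# The payoff f, step sizes, and particle counts are illustrative assumptions.

def f(x, y):
    """Toy smooth payoff: the x-player minimizes E[f], the y-player maximizes it."""
    return np.sin(x) * np.cos(y) + 0.05 * (x**2 - y**2)

def df_dx(x, y):
    return np.cos(x) * np.cos(y) + 0.1 * x

def df_dy(x, y):
    return -np.sin(x) * np.sin(y) - 0.1 * y

rng = np.random.default_rng(0)
m = n = 16                                         # atoms per player
x, y = rng.normal(size=m), rng.normal(size=n)      # atom positions
a, b = np.full(m, 1.0 / m), np.full(n, 1.0 / n)    # atom weights (sum to 1)
eta_w, eta_x = 0.5, 0.05                           # Fisher-Rao / Wasserstein step sizes

for _ in range(1000):
    F = f(x[:, None], y[None, :])                  # F[i, j] = f(x_i, y_j)
    Vx, Vy = F @ b, F.T @ a                        # first variations at each atom
    Gx = df_dx(x[:, None], y[None, :]) @ b         # position gradients for x-atoms
    Gy = df_dy(x[:, None], y[None, :]).T @ a       # position gradients for y-atoms
    # Fisher-Rao part: multiplicative update of the weights, then renormalize.
    a = a * np.exp(-eta_w * Vx); a /= a.sum()
    b = b * np.exp(+eta_w * Vy); b /= b.sum()
    # Wasserstein part: gradient descent/ascent on the atom positions.
    x, y = x - eta_x * Gx, y + eta_x * Gy

# A crude proxy for the duality gap, restricted to the current atoms:
# max(F.T @ a) - min(F @ b) at the final iterate.
```

Reassigning weights multiplicatively while moving positions by gradient steps is what the "interacting" Wasserstein-Fisher-Rao dynamics refers to; the paper's contribution is the analysis of the implicit (proximal) version of such updates.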