Finding the mixed Nash equilibria (MNE) of a two-player zero-sum continuous game is an important and challenging problem in machine learning. A canonical algorithm for finding the MNE is the noisy gradient descent ascent method, which in the infinite-particle limit gives rise to the {\em Mean-Field Gradient Descent Ascent} (GDA) dynamics on the space of probability measures. In this paper, we first study the convergence of a two-scale Mean-Field GDA dynamics for finding the MNE of the entropy-regularized objective. More precisely, we show that for any fixed positive temperature (or regularization parameter), the two-scale Mean-Field GDA with a {\em finite} scale ratio converges exponentially to the unique MNE without assuming convexity or concavity of the interaction potential. The key ingredient of our proof is the construction of new Lyapunov functions that dissipate exponentially along the Mean-Field GDA. We further study the simulated annealing of the Mean-Field GDA dynamics. We show that with a temperature schedule that decays logarithmically in time, the annealed Mean-Field GDA converges to the MNE of the original unregularized objective function.
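As a rough illustration of the noisy gradient descent ascent method mentioned above, the following is a minimal particle sketch and not the paper's implementation. Everything specific in it is an assumption made for the example: the state space is taken to be the one-dimensional torus [0, 1), the interaction potential is the hypothetical choice K(x, y) = cos(2π(x − y)), and the names `tau` (temperature), `eta` (scale ratio), `dt`, and `noisy_gda` are illustrative. The min player's particles follow the negative averaged gradient, the max player's particles follow the positive averaged gradient sped up by `eta`, and both receive Gaussian noise scaled by the temperature.

```python
import numpy as np


def grad_K(x, y):
    """Partial derivatives of the hypothetical potential K(x, y) = cos(2*pi*(x - y)).

    Returns (dK/dx, dK/dy), each of shape (len(x), len(y)).
    """
    s = 2.0 * np.pi * np.sin(2.0 * np.pi * (x[:, None] - y[None, :]))
    return -s, s


def noisy_gda(n=200, m=200, tau=0.1, eta=5.0, dt=1e-3, steps=1000, seed=0):
    """Euler-Maruyama sketch of two-scale noisy GDA with n and m particles.

    tau is the (fixed) temperature and eta the scale ratio; both are
    illustrative parameters, not values taken from the paper.
    """
    rng = np.random.default_rng(seed)
    x = rng.random(n)  # particles of the minimizing player
    y = rng.random(m)  # particles of the maximizing player
    for _ in range(steps):
        gx, gy = grad_K(x, y)
        drift_x = -gx.mean(axis=1)       # descent, averaged over the y particles
        drift_y = eta * gy.mean(axis=0)  # ascent at the faster time scale
        x = x + dt * drift_x + np.sqrt(2.0 * tau * dt) * rng.standard_normal(n)
        y = y + dt * drift_y + np.sqrt(2.0 * tau * eta * dt) * rng.standard_normal(m)
        x %= 1.0  # wrap back onto the torus
        y %= 1.0
    return x, y


if __name__ == "__main__":
    xs, ys = noisy_gda()
    print(xs.shape, ys.shape)
```

The empirical distributions of `xs` and `ys` play the role of the two players' mixed strategies. An annealed variant would replace the constant `tau` with a schedule that decreases with the iteration count; the specific logarithmic schedule analyzed in the paper is not reproduced here.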