Exploitability Minimization in Games and Beyond

Pseudo-games are a natural and well-known generalization of normal-form games, in which the actions taken by each player affect not only the other players' payoffs, as in games, but also the other players' strategy sets. The solution concept par excellence for pseudo-games is the generalized Nash equilibrium (GNE), i.e., a strategy profile at which each player's strategy is feasible and no player can improve their payoffs by unilaterally deviating to another strategy in the strategy set determined by the other players' strategies. The computation of GNE in pseudo-games has long been a problem of interest, due to applications in a wide variety of fields, from environmental protection to logistics to telecommunications. Although computing GNE is PPAD-hard in general, it is still of interest to try to compute them in restricted classes of pseudo-games. One approach is to search for a strategy profile that minimizes exploitability, i.e., the sum of the regrets across all players. As exploitability is nondifferentiable in general, developing efficient first-order methods that minimize it might not seem possible at first glance. We observe, however, that the exploitability-minimization problem can be recast as a min-max optimization problem, and thereby obtain polynomial-time first-order methods to compute a refinement of GNE, namely the variational equilibria (VE), in convex-concave cumulative regret pseudo-games with jointly convex constraints. More generally, we also show that our methods find the stationary points of the exploitability in polynomial time in Lipschitz-smooth pseudo-games with jointly convex constraints. Finally, we demonstrate in experiments that our methods not only outperform known algorithms, but that even in pseudo-games where they are not guaranteed to converge to a GNE, they may do so nonetheless, with proper initialization.
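
To make the notion of exploitability concrete, here is a minimal sketch that computes it for an ordinary two-player normal-form game. This is the special case of a pseudo-game in which no player's strategy set depends on the others, so it does not illustrate coupled constraints or the paper's algorithms. The function names and the matching-pennies payoffs are illustrative assumptions, not taken from the paper.

```python
def expected_payoff(M, x, y):
    """Expected payoff x^T M y for mixed strategies x (rows) and y (columns)."""
    return sum(x[i] * M[i][j] * y[j]
               for i in range(len(x)) for j in range(len(y)))


def exploitability(A, B, x, y):
    """Sum of both players' regrets at the strategy profile (x, y).

    A[i][j] and B[i][j] are the row and column player's payoffs when the
    row player picks action i and the column player picks action j.
    """
    n, m = len(x), len(y)
    # Best payoff each player could get by deviating unilaterally. A best
    # response always exists among pure actions, so a max over them suffices.
    row_best = max(sum(A[i][j] * y[j] for j in range(m)) for i in range(n))
    col_best = max(sum(x[i] * B[i][j] for i in range(n)) for j in range(m))
    # Regret = best deviation payoff minus the payoff actually obtained.
    row_regret = row_best - expected_payoff(A, x, y)
    col_regret = col_best - expected_payoff(B, x, y)
    return row_regret + col_regret


# Matching pennies (hypothetical example): zero-sum, B = -A.
A = [[1.0, -1.0], [-1.0, 1.0]]
B = [[-1.0, 1.0], [1.0, -1.0]]
uniform = [0.5, 0.5]
heads = [1.0, 0.0]

print(exploitability(A, B, uniform, uniform))  # 0.0: uniform mixing is an equilibrium
print(exploitability(A, B, heads, heads))      # 2.0: the column player gains 2 by switching
```

Exploitability is nonnegative and equals zero exactly at an equilibrium, which is why minimizing it is a route to computing one. The inner `max` calls are the maximization over unilateral deviations; minimizing the returned value over the profile `(x, y)` gives the min-max reading of the problem described in the abstract.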
