贝叶斯相关均衡和无悔动态 (Bayes correlated equilibria and no-regret dynamics)

This paper explores equilibrium concepts for Bayesian games, which are fundamental models of games with incomplete information. We aim at three desirable properties of equilibria. First, equilibria can be naturally realized by introducing a mediator into games. Second, an equilibrium can be computed efficiently in a distributed fashion. Third, any equilibrium in that class approximately maximizes social welfare, as measured by the price of anarchy, for a broad class of games. These three properties allow players to compute an equilibrium and realize it via a mediator, thereby settling into a stable state with approximately optimal social welfare. Our main result is the existence of an equilibrium concept that satisfies these three properties. Toward this goal, we characterize various (non-equivalent) extensions of correlated equilibria, collectively known as Bayes correlated equilibria. In particular, we focus on communication equilibria (also known as coordination mechanisms), which can be realized by a mediator who gathers each player's private information and then sends correlated recommendations to the players. We show that if each player minimizes a variant of regret called untruthful swap regret in repeated play of Bayesian games, the empirical distribution of these dynamics converges to a communication equilibrium. We present an efficient algorithm for minimizing untruthful swap regret with a sublinear upper bound, which we prove to be tight up to a multiplicative constant. As a result, by simulating the dynamics with our algorithm, we can efficiently compute an approximate communication equilibrium. Furthermore, we extend existing lower bounds on the price of anarchy based on the smoothness arguments from Bayes Nash equilibria to equilibria obtained by the proposed dynamics.

翻译：本文探讨贝叶斯博弈的均衡概念，这是不完全信息博弈的基本模型。我们旨在实现三个期望的均衡特性。首先，均衡自然可以通过引入协调者实现。其次，均衡可以被分布式地高效地计算。第三，在一个广泛的游戏类别中，任何均衡大致上都最大化社会福利，作为公理收益率的衡量标准。这三个属性允许玩家计算均衡并通过协调者实现它，从而进入一个稳定状态，具有近似最优的社会福利。我们的主要结果是存在一种均衡概念，满足这三个属性。为此，我们刻画了各种（不等价的）相关均衡扩展，统称为贝叶斯相关均衡。特别是，我们关注通信均衡（也称为协调机制），可以通过收集每个玩家的个人信息，然后向玩家发送相关的建议，使之实现。我们表明，如果每个玩家在重复博弈时最小化一种称为不诚实交换后悔的遗憾变化，这些动态的经验分布趋向于通信均衡。我们提出了一种最小化不诚实交换后悔的有效算法，其上限是次线性的，我们证明了其上限是紧凑的，到乘法常数。因此，通过使用我们的算法模拟这些动态，我们可以有效地计算近似的通信均衡。此外，我们将现有的关于基于贝叶斯-纳什均衡的光滑性论据的公理收益率下限扩展到了该文提出的动态的均衡。