No-Regret Learning Dynamics for Extensive-Form Correlated Equilibrium

The existence of simple, uncoupled no-regret dynamics that converge to correlated equilibria in normal-form games is a celebrated result in the theory of multi-agent systems. Specifically, it has been known for more than 20 years that when all players seek to minimize their internal regret in a repeated normal-form game, the empirical frequency of play converges to a normal-form correlated equilibrium. Extensive-form (that is, tree-form) games generalize normal-form games by modeling both sequential and simultaneous moves, as well as private information. Because of the sequential nature of the game and the presence of partial information, extensive-form correlation has significantly different properties from its normal-form counterpart, and many questions about it remain open research directions. Extensive-form correlated equilibrium (EFCE) has been proposed as the natural extensive-form counterpart to normal-form correlated equilibrium. However, it was previously unknown whether EFCE emerges as the result of uncoupled agent dynamics. In this paper, we give the first uncoupled no-regret dynamics that converge to the set of EFCEs in $n$-player general-sum extensive-form games with perfect recall. First, we introduce a notion of trigger regret in extensive-form games, which extends that of internal regret in normal-form games. When each player has low trigger regret, the empirical frequency of play is close to an EFCE. Then, we give an efficient no-trigger-regret algorithm. Our algorithm decomposes trigger regret into local subproblems at each of the player's decision points, and constructs a global strategy for the player from the local solutions.
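
To make the normal-form starting point concrete, the following is a minimal sketch of how internal regret can be measured on a recorded sequence of play. It covers only the classical normal-form notion mentioned in the abstract, not the paper's trigger regret or its algorithm. The game (a two-player game of Chicken), its payoff numbers, and the names `PAYOFF`, `utility`, and `internal_regret` are assumptions chosen for illustration.

```python
from itertools import permutations

# Illustrative game of Chicken (payoffs are an assumption, not from the paper).
# Actions: 0 = Dare, 1 = Yield. PAYOFF[p][a0][a1] is player p's utility when
# player 0 plays a0 and player 1 plays a1.
PAYOFF = [
    [[0, 7], [2, 6]],  # player 0
    [[0, 2], [7, 6]],  # player 1
]


def utility(player, joint):
    """Utility of `player` under the joint action `joint` = (a0, a1)."""
    return PAYOFF[player][joint[0]][joint[1]]


def internal_regret(player, history):
    """Average internal regret of `player` over a list of joint actions.

    For every ordered pair (a, b) with a != b, compute the total gain the
    player would have obtained by playing b in every round where it actually
    played a, holding the opponent's play fixed. Return the largest such
    gain (floored at zero) divided by the number of rounds.
    """
    rounds = len(history)
    worst = 0.0
    for a, b in permutations(range(2), 2):
        gain = 0.0
        for joint in history:
            if joint[player] == a:
                deviated = list(joint)
                deviated[player] = b
                gain += utility(player, tuple(deviated)) - utility(player, joint)
        worst = max(worst, gain / rounds)
    return worst


# Play that follows a correlated equilibrium of this game: uniform over
# (Dare, Yield), (Yield, Dare), (Yield, Yield).
ce_history = [(0, 1), (1, 0), (1, 1)] * 100
# Play where both players always dare: each would rather have yielded.
crash_history = [(0, 0)] * 300

print(internal_regret(0, ce_history), internal_regret(1, ce_history))
print(internal_regret(0, crash_history))
```

In the first history no single action swap improves either player's average payoff, so both internal regrets are zero and the empirical frequency of play satisfies the correlated-equilibrium incentive constraints. In the second history, swapping Dare for Yield would gain each player 2 per round, so the internal regret is positive.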
