In this paper, we introduce a new representation for team-coordinated game-theoretic decision making, which we call the team belief DAG form. In our representation, at every timestep, a team coordinator observes the information that is public to all of the team's members, and then decides on a prescription for all the possible states consistent with its observations. Our representation unifies and extends recent approaches to team coordination. Similar to the approach of Carminati et al. (2021), our team belief DAG form can be used to capture adversarial team games, and enables standard, out-of-the-box game-theoretic techniques including no-regret learning (e.g., CFR and its state-of-the-art modern variants such as DCFR and PCFR+) and first-order methods. However, our representation can be exponentially smaller, and can be viewed as a lossless abstraction of theirs into a directed acyclic graph. In particular, like the LP-based algorithm of Zhang & Sandholm (2022), the size of our representation scales with the amount of information uncommon to the team; in fact, using linear programming on top of our team belief DAG form to solve for a team correlated equilibrium in an adversarial team game recovers almost exactly their algorithm. Unlike that paper, however, our representation explicitly exposes the structure of the decision space, which is what enables the aforementioned game-theoretic techniques.
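To make the notion of a prescription concrete, the following is a minimal illustrative sketch, not the construction used in the paper. It assumes a belief is simply the set of private states consistent with the coordinator's public observations, and that a prescription assigns one legal action to each such state. The state names, action names, and the `prescriptions` helper are hypothetical and chosen only for illustration.

```python
from itertools import product


def prescriptions(belief, actions):
    """Enumerate every prescription available at a belief.

    belief:  set of (hypothetical) private states consistent with the
             public information observed by the coordinator.
    actions: dict mapping each state to its list of legal actions.

    Each prescription is a dict assigning one action to every state.
    """
    states = sorted(belief)  # fix an order so the output is deterministic
    for choice in product(*(actions[s] for s in states)):
        yield dict(zip(states, choice))


# Toy example (assumed, not from the paper): two possible private states,
# each with two legal actions.
belief = {"holds_ace", "holds_king"}
actions = {
    "holds_ace": ["bet", "check"],
    "holds_king": ["bet", "check"],
}
all_prescriptions = list(prescriptions(belief, actions))
```

In this toy setting the number of prescriptions is the product of the per-state action counts, so it grows with the number of states in the belief. This is one informal way to see why the size of such a representation would depend on how much information is not common to the team.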