The study of learning in games has thus far focused primarily on normal-form games. In contrast, our understanding of learning in extensive-form games (EFGs), and particularly in EFGs with many agents, lags far behind, even though such games are closer in nature to many real-world applications. We consider the natural class of Network Zero-Sum Extensive Form Games, which combines the global zero-sum property of agent payoffs, the efficient representation of graphical games, and the expressive power of EFGs. We examine the convergence properties of Optimistic Gradient Ascent (OGA) in these games. We prove that the time-average behavior of such online learning dynamics converges to the set of Nash Equilibria at a rate of $O(1/T)$. Moreover, we show that the day-to-day (last-iterate) behavior also converges to Nash at a rate of $O(c^{-t})$ for some game-dependent constant $c>1$.
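As an illustrative sketch only (a single two-player zero-sum matrix game, not the paper's network extensive-form setting), the OGA dynamics referred to above can be demonstrated on Matching Pennies, where the last iterate converges to the unique Nash equilibrium. The step size `eta` and iteration count are arbitrary choices for illustration, not values from the paper.

```python
# Illustrative sketch: Optimistic Gradient Ascent (OGA) on Matching
# Pennies, a 2x2 zero-sum game. This is a toy instance, not the
# network/extensive-form setting of the abstract; eta and the iteration
# count are arbitrary illustrative choices.

def proj_simplex_2d(v):
    """Euclidean projection of a 2-vector onto the probability simplex."""
    t = (v[0] + v[1] - 1.0) / 2.0
    p = [max(v[0] - t, 0.0), max(v[1] - t, 0.0)]
    s = p[0] + p[1]
    return [p[0] / s, p[1] / s]

# Payoff matrix for the maximizer: u(x, y) = x^T A y.
A = [[1.0, -1.0], [-1.0, 1.0]]

def grad_x(y):  # gradient of u w.r.t. x is A y
    return [A[0][0] * y[0] + A[0][1] * y[1],
            A[1][0] * y[0] + A[1][1] * y[1]]

def grad_y(x):  # gradient of -u w.r.t. y is -A^T x (y minimizes u)
    return [-(A[0][0] * x[0] + A[1][0] * x[1]),
            -(A[0][1] * x[0] + A[1][1] * x[1])]

eta = 0.1
x, y = [0.9, 0.1], [0.2, 0.8]
gx_prev, gy_prev = grad_x(y), grad_y(x)

for _ in range(2000):
    gx, gy = grad_x(y), grad_y(x)
    # Optimistic step: ordinary (projected) ascent plus an extrapolation
    # term anticipating the opponent's move: x + eta * (2*g_t - g_{t-1}).
    x = proj_simplex_2d([x[i] + eta * (2 * gx[i] - gx_prev[i]) for i in range(2)])
    y = proj_simplex_2d([y[i] + eta * (2 * gy[i] - gy_prev[i]) for i in range(2)])
    gx_prev, gy_prev = gx, gy

# The last iterate approaches the unique Nash equilibrium (1/2, 1/2).
print(x, y)
```

Plain gradient ascent/descent cycles around the equilibrium in this game; the single extra extrapolation term in the optimistic update is what produces the geometric last-iterate convergence the abstract refers to.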