Learning and equilibrium computation in games are fundamental problems across computer science and economics, with applications ranging from politics to machine learning. Much of the work in this area revolves around a simple algorithm termed \emph{randomized weighted majority} (RWM), also known as ``Hedge'' or ``Multiplicative Weights Update,'' which is well known to achieve statistically optimal rates in adversarial settings (Littlestone and Warmuth '94, Freund and Schapire '99). Unfortunately, RWM comes with an inherent computational barrier: it requires maintaining and sampling from a distribution over all possible actions. In typical settings of interest, the action space is exponentially large, seemingly rendering RWM useless in practice. In this work, we refute this notion for a broad variety of \emph{structured} games, showing it is possible to efficiently (approximately) sample the action space in RWM in \emph{polylogarithmic} time. This gives the first efficient no-regret algorithms for problems such as the \emph{(discrete) Colonel Blotto game}, \emph{matroid congestion}, \emph{matroid security}, and basic \emph{dueling games}. As an immediate corollary, we give a polylogarithmic-time meta-algorithm to compute approximate Nash equilibria for these games that is exponentially faster than prior methods in several important settings. Further, our algorithm is the first to efficiently compute equilibria for more involved variants of these games with general sums, more than two players, and, for Colonel Blotto, multiple resource types.
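The following is a minimal Python sketch of vanilla RWM/Hedge (not taken from this paper; the loss-matrix interface and step size below are illustrative assumptions). It makes the computational barrier above concrete: the algorithm explicitly maintains one weight per action, so both its memory and its per-round update cost scale linearly with the size of the action space, which is exponential in the structured games we study.

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)

def hedge(loss_matrix, eta):
    """Vanilla RWM/Hedge over an explicit action set.

    loss_matrix: (T, n) array with entries in [0, 1]; row t is
    revealed only after the round-t action is sampled. Note the
    O(n) memory and O(n) work per round -- the cost that must be
    avoided when n is exponentially large.
    """
    T, n = loss_matrix.shape
    weights = np.ones(n)
    total_loss = 0.0
    for t in range(T):
        probs = weights / weights.sum()   # sample an action proportionally to its weight
        action = rng.choice(n, p=probs)
        total_loss += loss_matrix[t, action]
        weights *= np.exp(-eta * loss_matrix[t])  # multiplicative update on every action
    return total_loss

# Toy run: two actions, adversary alternates which one is punished.
T = 1000
losses = np.zeros((T, 2))
losses[::2, 0] = 1.0
losses[1::2, 1] = 1.0
print(hedge(losses, eta=np.sqrt(np.log(2) / T)))
\end{verbatim}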