In the literature on game-theoretic equilibrium finding, the focus has mainly been on solving a single game in isolation. In practice, however, strategic interactions -- ranging from routing problems to online advertising auctions -- evolve dynamically, giving rise to many similar games that must be solved. To address this gap, we introduce meta-learning for equilibrium finding and learning to play games. We establish the first meta-learning guarantees for a variety of fundamental and well-studied classes of games, including two-player zero-sum games, general-sum games, and Stackelberg games. In particular, we obtain rates of convergence to different game-theoretic equilibria that depend on natural notions of similarity between the sequence of games encountered, while at the same time recovering the known single-game guarantees when the sequence of games is arbitrary. Along the way, we prove a number of new results in the single-game regime through a simple and unified framework, which may be of independent interest. Finally, we evaluate our meta-learning algorithms on endgames faced by the poker agent Libratus against top human professionals. The experiments show that games with varying stack sizes can be solved significantly faster using our meta-learning techniques than by solving them separately, often by an order of magnitude.
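To make the core idea concrete, the following is a minimal, self-contained sketch (not the paper's actual algorithm) of warm-starting no-regret dynamics across similar games: we run multiplicative-weights self-play on one two-player zero-sum matrix game, then reuse the resulting average strategies as the starting point for a nearby game. The specific matrices `A1`, `A2`, the step size, and the iteration count are illustrative assumptions; the point is only that a warm start inherited from a similar game begins with a much smaller duality gap than a cold (uniform) start.

```python
import numpy as np

def exploitability(A, x, y):
    """Duality gap of (x, y) in the zero-sum game A (row player maximizes x^T A y)."""
    return float(np.max(A @ y) - np.min(x @ A))

def mwu_selfplay(A, x0, y0, T=20000, eta=0.01):
    """Multiplicative-weights self-play; returns the average strategies,
    which converge to an approximate Nash equilibrium of A."""
    x, y = x0.copy(), y0.copy()
    x_sum, y_sum = np.zeros_like(x), np.zeros_like(y)
    for _ in range(T):
        x_sum += x
        y_sum += y
        gx = A @ y        # row player's payoff gradient (maximizer)
        gy = -(x @ A)     # column player's payoff gradient (minimizer)
        x = x * np.exp(eta * gx); x /= x.sum()
        y = y * np.exp(eta * gy); y /= y.sum()
    return x_sum / T, y_sum / T

# Two "similar" games: A2 is a small perturbation of A1 (hypothetical example).
A1 = np.array([[3.0, -1.0], [-2.0, 1.0]])
A2 = A1 + np.array([[0.1, 0.0], [0.0, -0.1]])

uniform = np.array([0.5, 0.5])
x_meta, y_meta = mwu_selfplay(A1, uniform, uniform)  # solve the first game

# On the nearby game A2, the inherited strategies are already near-optimal.
cold = exploitability(A2, uniform, uniform)
warm = exploitability(A2, x_meta, y_meta)
print(f"cold-start gap: {cold:.3f}, warm-start gap: {warm:.3f}")
```

Running this, the warm start inherited from `A1` begins with a far smaller exploitability on `A2` than the uniform cold start, so fewer iterations are needed to reach a target accuracy -- the kind of savings, driven by similarity between consecutive games, that the guarantees above quantify.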