While Nash equilibrium has emerged as the central game-theoretic solution concept, many important games contain several Nash equilibria, and we must determine how to select among them in order to create real strategic agents. Several Nash equilibrium refinement concepts have been proposed and studied for sequential imperfect-information games, the most prominent being trembling-hand perfect equilibrium, quasi-perfect equilibrium, and, more recently, one-sided quasi-perfect equilibrium. These concepts are robust to certain arbitrarily small mistakes and are guaranteed to exist; however, we argue that none of them is the correct concept for developing strong agents in sequential games of imperfect information. We define a new equilibrium refinement concept for extensive-form games called observable perfect equilibrium, in which the solution is robust to trembles in publicly observable action probabilities (and not necessarily to trembles in all action probabilities, some of which may not be observable by opposing players). Observable perfect equilibrium correctly captures the assumption that the opponent is playing as rationally as possible given the mistakes that have been observed, which previous solution concepts do not. We prove that an observable perfect equilibrium is guaranteed to exist, and demonstrate that it leads to a different solution than the prior extensive-form refinements in no-limit poker. We expect observable perfect equilibrium to be a useful equilibrium refinement concept for modeling many important imperfect-information games of interest in artificial intelligence.