While Nash equilibrium has emerged as the central game-theoretic solution concept, many important games contain several Nash equilibria, and we must determine how to select among them in order to create real strategic agents. Several Nash equilibrium refinement concepts have been proposed and studied for sequential imperfect-information games, the most prominent being trembling-hand perfect equilibrium, quasi-perfect equilibrium, and, more recently, one-sided quasi-perfect equilibrium. These concepts are robust to certain arbitrarily small mistakes and are guaranteed to always exist; however, we argue that none of them is the correct concept for developing strong agents in sequential games of imperfect information. We define a new equilibrium refinement concept for extensive-form games called observable perfect equilibrium, in which the solution is robust to trembles in publicly observable action probabilities (and not necessarily to trembles in all action probabilities, some of which may not be observable by opposing players). Observable perfect equilibrium correctly captures the assumption that the opponent is playing as rationally as possible given the mistakes that have been observed, which previous solution concepts do not. We prove that observable perfect equilibrium is always guaranteed to exist, and demonstrate that it leads to a different solution than the prior extensive-form refinements in no-limit poker. We expect observable perfect equilibrium to be a useful equilibrium refinement concept for modeling many important imperfect-information games of interest in artificial intelligence.