与铁线-手完美平衡的结合 (Last-iterate Convergence to Trembling-hand Perfect Equilibria)

Designing efficient algorithms to find Nash equilibrium (NE) refinements in sequential games is of paramount importance in practice. Indeed, it is well known that the NE has several weaknesses, since it may prescribe to play sub-optimal actions in those parts of the game that are never reached at the equilibrium. NE refinements, such as the extensive-form perfect equilibrium (EFPE), amend such weaknesses by accounting for the possibility of players' mistakes. This is crucial in real-world applications, where bounded rationality players are usually involved, and it turns out being useful also in boosting the performances of superhuman agents for recreational games like Poker. Nevertheless, only few works addressed the problem of computing NE refinements. Most of them propose algorithms finding exact NE refinements by means of linear programming, and, thus, these do not have the potential of scaling up to real-world-size games. On the other hand, existing iterative algorithms that exploit the tree structure of sequential games only provide convergence guarantees to approximate refinements. In this paper, we provide the first efficient last-iterate algorithm that provably converges to an EFPE in two-player zero-sum sequential games with imperfect information. Our algorithm works by tracking a sequence of equilibria of suitably-defined, regularized-perturbed games. In order to do that, it uses a procedure that is tailored to converge last-iterate to the equilibria of such games. Crucially, the updates performed by such a procedure can be performed efficiently by visiting the game tree, thus making our algorithm potentially more scalable than its linear-programming-based competitors. Finally, we evaluate our algorithm on a standard testbed of games, showing that it produces strategies which are much more robust to players' mistakes than those of state-of-the-art NE-computation algorithms.

翻译：设计高效的算法以寻找序列游戏中的纳什平衡( NE) 改进是实践中最重要的。事实上,众所周知, NE 有一些弱点,因为它可以规定在游戏中那些从未在平衡中达到的部位上玩亚最佳动作。 NE 的改进,例如广泛成形的完美平衡(EFPE),通过考虑玩家错误的可能性来修正这些弱点。这对于现实世界应用程序至关重要,在现实世界应用程序中,受约束的理性玩家通常都参与其中,而且这在提升像Poker这样的娱乐游戏超人代理器的性能方面也很有用。然而,只有很少的作品可以解决计算NEE的精度改进问题。它们大多建议通过线性编程程序找到精确的NE的精度改进的算法,因此,它们没有潜力提升到真实世界规模游戏的可能性。另一方面,利用现有的迭代算法,利用按顺序游戏的树结构结构只能保证接近精细的精细的精度。在本文中,我们第一次高效的最后一度算算法将它与EPE级游戏的精度调整过程相交汇。因此,通过我们定的精度的精度演的精度演的精度将最后的精度演的精度演程序,我们最精确的精细的精细的精细的精细的精细的精细的精细的精细的精细的算法程序, 。