In many real-world settings, agents engage in strategic interactions with multiple opposing agents who can employ a wide variety of strategies. The standard approach to designing agents for such settings is to compute or approximate a relevant game-theoretic solution concept, such as Nash equilibrium, and then follow the prescribed strategy. However, such a strategy ignores any observations of opponents' play, which may reveal shortcomings that can be exploited. We present an approach to opponent modeling in multiplayer imperfect-information games in which we collect observations of opponents' play through repeated interactions. We run experiments against a wide variety of real opponents and exact Nash equilibrium strategies in three-player Kuhn poker and show that our algorithm significantly outperforms all of the agents, including the exact Nash equilibrium strategies.