We study the learning dynamics induced by strategic agents who repeatedly play a game with an unknown payoff-relevant parameter. In these dynamics, a belief estimate of the parameter is repeatedly updated by Bayes' rule, given the players' strategies and realized payoffs. Players adjust their strategies by accounting for their best responses given the current belief. We show that, with probability 1, beliefs and strategies converge to a fixed point at which the belief consistently estimates the payoff distribution for the strategy, and the strategy is an equilibrium corresponding to the belief. However, learning may not always identify the unknown parameter, because the belief estimate relies on game outcomes that are endogenously generated by the players' strategies. We obtain necessary and sufficient conditions under which learning leads to a globally stable fixed point that is a complete-information Nash equilibrium. We also provide sufficient conditions that guarantee local stability of fixed-point beliefs and strategies.
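The coupled belief and strategy updates described above can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: a two-player Cournot-style game with price `a - theta * (q1 + q2)` plus Gaussian noise, a two-point parameter set `thetas`, and a strategy update that moves each player a fraction `step` of the way toward the best response to the mean belief. The function name `simulate` and all of its parameters are hypothetical.

```python
import math
import random


def simulate(true_theta=2.0, thetas=(1.0, 2.0), a=6.0, sigma=1.0,
             step=0.5, rounds=200, seed=0):
    """Toy Bayesian learning loop in an assumed two-player Cournot game."""
    rng = random.Random(seed)
    belief = [1.0 / len(thetas)] * len(thetas)  # uniform prior over thetas
    q = [1.0, 1.0]                              # initial quantities (strategies)

    for _ in range(rounds):
        total = sum(q)
        # Realized outcome: noisy price generated by the TRUE parameter.
        price = a - true_theta * total + rng.gauss(0.0, sigma)

        # Bayes' rule: reweight each candidate by its Gaussian likelihood.
        weights = [
            b * math.exp(-0.5 * ((price - (a - t * total)) / sigma) ** 2)
            for b, t in zip(belief, thetas)
        ]
        norm = sum(weights)
        belief = [w / norm for w in weights]

        # Best response to the opponent's quantity under the mean belief:
        # maximize q_i * (a - theta_bar * (q_i + q_j)) over q_i >= 0.
        theta_bar = sum(b * t for b, t in zip(belief, thetas))
        best = [
            max(0.0, (a - theta_bar * q[1 - i]) / (2.0 * theta_bar))
            for i in range(2)
        ]

        # Partial adjustment of each strategy toward its best response.
        q = [(1.0 - step) * q[i] + step * best[i] for i in range(2)]

    return belief, q


belief, q = simulate()
print("belief over thetas:", belief)
print("strategies:", q)
```

In this toy setting the belief concentrates on the true parameter and the quantities approach the complete-information Cournot equilibrium `a / (3 * true_theta)`. If both quantities were zero, the price would have the same distribution under every candidate parameter, so the outcomes would carry no information about it; this is the sense in which endogenously generated outcomes can prevent identification.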