We consider the problem of estimating the preferences of human agents from data on strategic systems in which the agents interact repeatedly. It was recently demonstrated that a new estimation method called "quantal regret" produces more accurate estimates for human agents than the classic approach, which assumes that agents are rational and reach a Nash equilibrium; however, this method has not been compared to methods that take into account behavioral aspects of human play. In this paper we leverage equilibrium concepts from behavioral economics for this purpose and ask how well they perform compared to the quantal-regret and Nash equilibrium methods. We develop four estimation methods based on established behavioral equilibrium models to infer the utilities of human agents from observed data of normal-form games. The equilibrium models we study are quantal-response equilibrium, action-sampling equilibrium, payoff-sampling equilibrium, and impulse-balance equilibrium. We show that for some of these concepts the inference can be carried out analytically via closed-form formulas, while for the others it can be carried out only algorithmically. We use experimental data from 2x2 games to evaluate the estimation success of these behavioral equilibrium methods. The results show that the estimates they produce are more accurate than those of the Nash equilibrium method. A comparison with the quantal-regret method shows that the behavioral methods have better hit rates, whereas the quantal-regret method performs better in terms of overall mean squared error; we discuss the differences between the methods.
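To make one of the behavioral models concrete, the following is a minimal sketch of the forward direction only: computing a quantal-response equilibrium of a 2x2 game when the utilities are known. It assumes the common logit form of the model, which the abstract does not specify; the function names, the payoff-matrix layout, and the precision parameter `lam` are hypothetical choices for illustration and are not taken from the paper. The estimation methods described above address the inverse problem of recovering utilities from observed play.

```python
import math


def _sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def logit_qre_2x2(A, B, lam, iterations=100):
    """Return (p, q) for a logit quantal-response equilibrium of a 2x2 game.

    Assumed (hypothetical) conventions:
      A[i][j] -- row player's payoff when row plays i and column plays j
      B[i][j] -- column player's payoff for the same action pair
      lam     -- logit precision (0 = uniform play, large = near best response)
      p, q    -- probabilities that row / column play their action 0
    """

    def col_response(p):
        # Column's expected payoff for each of its actions, given p.
        u0 = B[0][0] * p + B[1][0] * (1.0 - p)
        u1 = B[0][1] * p + B[1][1] * (1.0 - p)
        return _sigmoid(lam * (u0 - u1))

    def row_response(q):
        # Row's expected payoff for each of its actions, given q.
        u0 = A[0][0] * q + A[0][1] * (1.0 - q)
        u1 = A[1][0] * q + A[1][1] * (1.0 - q)
        return _sigmoid(lam * (u0 - u1))

    def gap(p):
        # A fixed point of the composed response map has gap(p) == 0.
        return row_response(col_response(p)) - p

    # gap(0) > 0 and gap(1) < 0 because the logistic function lies strictly
    # in (0, 1), so bisection is guaranteed to find a root. If the game has
    # several logit equilibria, this returns only one of them.
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0:
            lo = mid
        else:
            hi = mid
    p = 0.5 * (lo + hi)
    return p, col_response(p)
```

For example, calling `logit_qre_2x2([[1, -1], [-1, 1]], [[-1, 1], [1, -1]], lam=1.0)` on matching pennies yields probabilities close to one half for both players, as the symmetry of that game requires. Bisection is used here rather than naive fixed-point iteration because the latter can oscillate when `lam` is large.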