The use of automated learning agents is becoming increasingly prevalent in many online economic applications such as online auctions and automated trading. Motivated by such applications, this paper is dedicated to fundamental modeling and analysis of the strategic situations that the users of automated learning agents face. We consider strategic settings where several users engage in a repeated online interaction, assisted by regret-minimizing learning agents that repeatedly play a "game" on their behalf. We propose to view the outcomes of the agents' dynamics as inducing a "meta-game" between the users. Our main focus is on whether users can benefit in this meta-game from "manipulating" their own agents by misreporting their parameters to them. We define a general framework to model and analyze these strategic interactions between users of learning agents for general games, and we analyze the equilibria induced between the users in three classes of games. We show that, generally, users have incentives to misreport their parameters to their own agents, and that such strategic user behavior can lead to outcomes very different from those anticipated by standard analysis.