This work considers a novel information design problem and studies how crafting payoff-relevant environmental signals can, by itself, influence the behavior of intelligent agents. The agents' strategic interactions are captured by a Markov game in which each agent first selects one external signal from multiple signal sources as additional payoff-relevant information and then takes an action. A rational information designer (the principal) possesses one signal source and aims to influence the agents' equilibrium behavior by designing the information structure of the signals she sends to them. We propose a direct information design approach that incentivizes each agent to select the signal sent by the principal, so that the design process does not need to predict the agents' strategic selection behavior. We then introduce a design protocol for a given goal of the designer, which we refer to as obedient implementability (OIL), and characterize OIL in a class of obedient sequential Markov perfect equilibria (O-SMPE). We propose a design regime based on an approach that we call fixed-point alignment, which incentivizes the agents to choose the signal sent by the principal, guarantees that the agents' action-policy profile is the policy component of an O-SMPE, and ensures that the principal's goal is achieved. We then formulate the principal's optimal goal selection problem in terms of information design and characterize the optimization problem through the minimization of fixed-point misalignments. The proposed approach can be applied to elicit desired behaviors from multi-agent systems in both competitive and cooperative settings, and it can be extended to heterogeneous stochastic games in complete- and incomplete-information environments.