To regulate a social system composed of self-interested agents, economic incentives are often required to induce a desirable outcome. This incentive design problem naturally possesses a bilevel structure: a designer modifies the agents' rewards with incentives while anticipating their response, and the agents play a non-cooperative game that converges to an equilibrium. Existing bilevel optimization algorithms face a dilemma when applied to this problem: anticipating how incentives affect the agents at equilibrium requires solving the equilibrium problem repeatedly, which is computationally inefficient; bypassing the time-consuming equilibrium-finding step reduces the computational cost but may lead the designer to a sub-optimal solution. To address this dilemma, we propose a method that tackles the designer's and the agents' problems simultaneously in a single loop. Specifically, at each iteration, both the designer and the agents move only one step. Nevertheless, we allow the designer to gradually learn the overall influence of the incentives on the agents, which guarantees optimality after convergence. The convergence rate of the proposed scheme is also established for a broad class of games.
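To make the single-loop idea concrete, the following is a minimal sketch on a toy two-player linear-quadratic game. It is not the authors' algorithm: the game, the constants `A`, `B`, `TARGET`, the step sizes, and the way the designer "learns the influence of the incentives" (tracking a running sensitivity estimate `J` by differentiating the agents' one-step update) are all assumptions chosen for illustration. Each iteration performs exactly one designer step and one agent step, with no inner equilibrium solve.

```python
# Toy two-player linear-quadratic game (all numbers are illustrative assumptions).
# Player i picks x[i] to maximise
#   u_i(x, theta) = x[i] * (A[i] + B * x[1 - i] + theta[i]) - 0.5 * x[i] ** 2,
# so its payoff gradient is A[i] + B * x[1 - i] + theta[i] - x[i].

A = [1.0, 2.0]        # baseline marginal payoffs (assumed)
B = 0.5               # interaction strength between the players (assumed, |B| < 1)
TARGET = [3.0, 1.0]   # action profile the designer wants at equilibrium (assumed)


def payoff_gradient(x, theta):
    """Gradient of each player's payoff with respect to its own action."""
    return [A[i] + B * x[1 - i] + theta[i] - x[i] for i in range(2)]


def single_loop(steps=2000, eta=0.2, alpha=0.05):
    """Run designer and agents together, one step each per iteration."""
    x = [0.0, 0.0]                    # agents' actions
    theta = [0.0, 0.0]                # designer's incentives
    J = [[0.0, 0.0], [0.0, 0.0]]      # running estimate of d x[i] / d theta[j]

    for _ in range(steps):
        # Designer: gradient of 0.5 * ||x - TARGET||^2 through the current J.
        err = [x[i] - TARGET[i] for i in range(2)]
        grad_theta = [sum(J[i][j] * err[i] for i in range(2)) for j in range(2)]

        # Agents: a single gradient-ascent step on their own payoffs.
        g = payoff_gradient(x, theta)
        new_x = [x[i] + eta * g[i] for i in range(2)]

        # Sensitivity: differentiate the agents' update with respect to theta,
        # so J drifts toward the equilibrium sensitivity without solving for it.
        new_J = [
            [
                J[i][j] + eta * ((1.0 if i == j else 0.0) + B * J[1 - i][j] - J[i][j])
                for j in range(2)
            ]
            for i in range(2)
        ]

        # Apply all updates simultaneously.
        theta = [theta[j] - alpha * grad_theta[j] for j in range(2)]
        x, J = new_x, new_J

    return x, theta, J


if __name__ == "__main__":
    x, theta, J = single_loop()
    print("actions:", x)
    print("incentives:", theta)
```

In this toy game the only fixed point of the coupled updates has the agents at `TARGET` with zero payoff gradient, which corresponds to incentives of `[1.5, -2.5]`. The step sizes were picked by hand so that the coupled iteration is stable for this particular game; other games or step sizes may need different choices.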