To regulate a social system comprised of self-interested agents, economic incentives (e.g., taxes, tolls, and subsidies) are often required to induce a desirable outcome. This incentive design problem naturally possesses a bi-level structure, in which an upper-level "designer" modifies the payoffs of the agents with incentives while anticipating the response of the agents at the lower level, who play a non-cooperative game that converges to an equilibrium. The existing bi-level optimization algorithms developed in machine learning raise a dilemma when applied to this problem: anticipating how incentives affect the agents at equilibrium requires solving the equilibrium problem repeatedly, which is computationally inefficient; bypassing the time-consuming step of equilibrium-finding can reduce the computational cost, but may lead the designer to a sub-optimal solution. To address such a dilemma, we propose a method that tackles the designer's and agents' problems simultaneously in a single loop. In particular, at each iteration, both the designer and the agents only move one step based on the first-order information. In the proposed scheme, although the designer does not solve the equilibrium problem repeatedly, it can anticipate the overall influence of the incentives on the agents, which guarantees optimality. We prove that the algorithm converges to the global optima at a sublinear rate for a broad class of games.
翻译:为了规范由自我利益代理人组成的社会制度,往往需要经济激励(例如税收、通行费和补贴)来促成一个理想的结果。这种激励设计问题自然具有双层结构,在这种结构中,高层“设计者”用奖励来改变代理人的付款,同时预测下层代理者的反应,他们玩的是一种不合作的游戏,这种游戏与平衡一致。在机器学习中开发的现有双级优化算法在应用这一问题时会带来一个两难境地:预测激励如何影响均衡代理人,需要反复解决均衡问题,而平衡问题在计算上效率低下;绕过平衡调查中耗时的一步可以降低计算成本,但可能导致设计者采取次优的解决方案。为了解决这种困境,我们建议一种方法,在单一的循环中同时解决设计者和代理人的问题。特别是每次循环中,设计者和代理人都只能根据第一阶级信息迈出一步。在拟议的计划中,虽然设计者们并没有选择一个最优水平的汇率,但可以反复地预测一个最高水平的汇率,我们无法选择一种最高水平的汇率。