To induce a desired equilibrium in a social system composed of self-interested agents, economic incentives (e.g., taxes, tolls, and subsidies) are often required to correct an inefficient outcome. Such an incentive design problem naturally has a bi-level structure: an upper-level "designer" uses incentives to revise the agents' payoffs while anticipating their response, and the agents play a non-cooperative game at the lower level. Existing bi-level optimization algorithms developed in machine learning face a dilemma when applied to this problem. Anticipating how incentives affect the agents at equilibrium requires solving the equilibrium problem repeatedly, which is computationally inefficient; bypassing the time-consuming equilibrium-finding step reduces the computational cost but may lead to a sub-optimal solution. We therefore propose an efficient method that tackles the designer's and the agents' problems simultaneously in a single loop. At each iteration, the designer and the agents each take a single step using only first-order information. Although the designer does not solve the equilibrium problem repeatedly in the proposed scheme, it can still anticipate the overall influence of the incentives on the agents, which guarantees optimality. We prove that the algorithm converges to a global optimum at a sublinear rate for a broad class of games.
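To make the single-loop idea concrete, the following is a minimal sketch on a toy problem and is not the algorithm analyzed in the paper. Everything in it is an assumption made for illustration: a two-route network in which route 1 has travel cost equal to its flow `x` and route 2 has constant cost 1, a toll `theta` charged on route 1, a projected-gradient step for the agents, constant step sizes `alpha` and `beta`, and the hypothetical function name `design_and_play`. The designer estimates how the toll shifts the equilibrium by implicitly differentiating the equilibrium condition `x + theta = 1` at the agents' current (non-equilibrium) flow, so no equilibrium is ever solved inside the loop.

```python
def design_and_play(alpha=0.5, beta=0.1, iters=500):
    """Single-loop toll design on an assumed two-route toy network.

    x     : share of agents on route 1 (cost x + theta); route 2 costs 1.
    theta : toll on route 1, chosen by the designer.
    The designer minimizes total travel cost x*x + (1 - x)*1.
    """
    def clip(v):
        # Keep flows and tolls inside [0, 1].
        return min(1.0, max(0.0, v))

    x, theta = 1.0, 0.0  # untolled equilibrium: everyone takes route 1
    for _ in range(iters):
        # Agents: one first-order step toward the cheaper route.
        agent_grad = (x + theta) - 1.0
        # Designer: one first-order step on total travel cost.
        # d(cost)/dx = 2x - 1, and implicit differentiation of
        # x + theta = 1 gives dx/dtheta = -1.
        designer_grad = (2.0 * x - 1.0) * (-1.0)
        # Both players move simultaneously, one step each.
        x, theta = clip(x - alpha * agent_grad), clip(theta - beta * designer_grad)
    return x, theta


x, theta = design_and_play()
print(f"flow on route 1: {x:.4f}, toll: {theta:.4f}")
```

In this toy setting the total travel cost is minimized at `x = 0.5`, which a toll of `theta = 0.5` induces as an equilibrium, so the iterates can be checked against those values. The constant step sizes are chosen only so that this particular example is stable; the choice of step sizes in general is beyond what this sketch shows.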