Under the Markov decision process (MDP) congestion game framework, we study the problem of enforcing global constraints using tolls on a population of players with stochastic dynamics and coupled congestion costs. Existing work demonstrates that by enforcing identical tolls on every player, the optimal joint strategy for the playing population can be shifted to satisfy global design constraints. However, computing the minimum tolling value for constraint satisfaction requires explicit modelling of the congestion cost as a function of the playing population. In this paper, we assume that both the playing population and the constraint-enforcing authority, the game designer, lack such a model. Instead, the game designer can enforce tolls on a gaming instance that responds by approximating the optimal joint strategy under any toll. Under these assumptions, we develop a myopic algorithm that enables the game designer to compute the minimum tolling value, and prove that, up to the approximation error made by the gaming instance, our algorithm not only converges to the correct toll, but will guarantee average constraint satisfaction during the iterative process. Finally, we demonstrate how our model and algorithm can be applied to the profit-seeking ride-share driver population of Manhattan, New York City to optimally reduce traffic congestion using tolls.
翻译:在Markov决定(MDP)拥堵游戏框架下,我们研究使用对具有随机动态和同时拥堵成本的玩家人口进行收费来实施全球限制的问题。现有工作表明,通过对每个玩家实施相同的收费,可以改变游戏玩家的最佳联合战略,以满足全球设计限制。然而,计算制约满意度的最低定价值需要明确模拟作为玩家的功能的拥堵成本。在本文中,我们假设玩家人口和约束强制实施当局、游戏设计者都缺乏这样一个模型。相反,游戏设计者可以对一个通过在任何收费下接近最佳联合战略来应对的赌博实例强制实施收费。根据这些假设,我们开发了一种近似算法,使游戏设计者能够计算最小的定价值,并证明,除了游戏游戏游戏游戏游戏场的近似错误之外,我们的算法不仅与正确的收费一致,而且保证在迭接过程中平均的制约满意度。最后,我们展示了我们的模型和算法可以如何将最佳交通流量应用到曼哈顿州的交通。