We investigate both stationary and time-varying, nonmonotone generalized Nash equilibrium problems that exhibit symmetric interactions among the agents, which are known to be potential. As may happen in practice, however, we envision a scenario in which the formal expression of the underlying potential function is not available, and we design a semi-decentralized Nash equilibrium seeking algorithm. In the proposed two-layer scheme, a coordinator iteratively integrates the (possibly noisy and sporadic) agents' feedback to learn the pseudo-gradients of the agents, and then designs personalized incentives for them. On their side, the agents receive those personalized incentives, compute a solution to an extended game, and then return feedback measurements to the coordinator. In the stationary setting, our algorithm returns a Nash equilibrium provided that the coordinator is endowed with standard learning policies, while in the time-varying case it returns a Nash equilibrium up to a constant, yet adjustable, error. As a motivating application, we consider the ride-hailing service provided by several companies with mobility-as-a-service orchestration, necessary to both handle competition among firms and avoid traffic congestion; we also use this application to run numerical experiments that verify our results.
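To make the coordinator/agent loop concrete, the following is a minimal toy sketch and not the algorithm proposed in the paper. It assumes a hypothetical two-agent quadratic game with scalar decisions in a box, a symmetric but indefinite interaction matrix `Q` (so the game is potential and nonmonotone), and a coordinator that simply reuses the latest noisy pseudo-gradient measurement as its "learned" estimate. The incentive form (a linear term plus a proximal penalty weighted by `rho`) and all names (`Q`, `C`, `rho`, `agent_response`, `nash_residual`) are illustrative assumptions.

```python
import random

# Hypothetical toy game (an assumption, not taken from the paper): agent i has cost
#   J_i(x) = 0.5*Q[i][i]*x_i**2 + x_i * sum_{j != i} Q[i][j]*x_j + C[i]*x_i,
# with x_i restricted to [LO, HI].
Q = [[1.0, 2.0], [2.0, 1.0]]  # symmetric => potential game; indefinite => nonmonotone
C = [-1.0, 0.5]
LO, HI = -1.0, 1.0


def clip(v):
    """Project a scalar onto the feasible interval [LO, HI]."""
    return max(LO, min(HI, v))


def pseudo_gradient(x):
    """True pseudo-gradient F(x) = Qx + C; the coordinator never evaluates it directly."""
    n = len(x)
    return [sum(Q[i][j] * x[j] for j in range(n)) + C[i] for i in range(n)]


def agent_feedback(x, noise_std, rng):
    """Agents report noisy measurements of their own partial gradients."""
    return [g + rng.gauss(0.0, noise_std) for g in pseudo_gradient(x)]


def agent_response(x_prev_i, g_hat_i, rho):
    """Agent i minimizes the assumed incentive
    g_hat_i * x_i + (rho / 2) * (x_i - x_prev_i)**2 over [LO, HI] (closed form)."""
    return clip(x_prev_i - g_hat_i / rho)


def run(iterations=500, rho=4.0, noise_std=0.0, seed=0):
    """Alternate coordinator and agent steps; rho exceeds the largest |eigenvalue| of Q (3)."""
    rng = random.Random(seed)
    x = [0.0, 0.0]
    for _ in range(iterations):
        # Coordinator layer: the estimate is just the latest feedback (a simplification).
        g_hat = agent_feedback(x, noise_std, rng)
        # Agent layer: each agent reacts to its personalized incentive.
        x = [agent_response(x[i], g_hat[i], rho) for i in range(len(x))]
    return x


def nash_residual(x):
    """Natural-map residual: zero when no agent gains from a projected gradient step."""
    F = pseudo_gradient(x)
    return max(abs(x[i] - clip(x[i] - F[i])) for i in range(len(x)))


if __name__ == "__main__":
    x_star = run()
    print(x_star, nash_residual(x_star))
```

In this sketch each agent's cost is convex in its own variable, so a point with zero residual is a Nash equilibrium of the toy game. Raising `noise_std` gives a rough feel for how measurement noise enters the loop, although it says nothing about the error bounds stated in the abstract.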