In this paper, we investigate Nash-regret minimization in congestion games, a class of games with benign theoretical structure and broad real-world applications. We first propose a centralized algorithm based on the optimism in the face of uncertainty principle for congestion games with (semi-)bandit feedback, and obtain finite-sample guarantees. Then we propose a decentralized algorithm via a novel combination of the Frank-Wolfe method and G-optimal design. By exploiting the structure of the congestion game, we show the sample complexity of both algorithms depends only polynomially on the number of players and the number of facilities, but not the size of the action set, which can be exponentially large in terms of the number of facilities. We further define a new problem class, Markov congestion games, which allows us to model the non-stationarity in congestion games. We propose a centralized algorithm for Markov congestion games, whose sample complexity again has only polynomial dependence on all relevant problem parameters, but not the size of the action set.
翻译:在本文中,我们调查了堵塞游戏中的Nash-regret最小化问题,这是一个具有良性理论结构和广泛现实应用的游戏类别。我们首先提出一种基于乐观的集中算法,这种算法基于对使用(半)bandit反馈的堵塞游戏的不确定性原则的乐观态度,并获得有限的抽样保证。然后我们通过弗兰克-沃夫方法和G-最佳设计的新组合,提出一种分散算法。通过利用堵塞游戏的结构,我们显示了两种算法的样本复杂性仅取决于玩家的数量和设施的数量,而并不取决于成套动作的规模,而从设施的数量来看,这可能非常大。我们进一步定义了一个新的问题类别,即马尔科夫堵塞游戏,使我们能够模拟堵塞游戏中的非常态性。我们建议了马可夫热游戏的集中算法,其样本复杂性再次仅对所有相关的问题参数具有多位依赖性,而不是行动组合的大小。