We consider the traffic assignment problem in nonatomic routing games where the players' cost functions may be subject to random fluctuations (e.g., weather disturbances, perturbations in the underlying network, etc.). We tackle this problem from the viewpoint of a control interface that makes routing recommendations based solely on observed costs, without any further knowledge of the system's governing dynamics (such as the network's cost functions or the distribution of any random events affecting the network). In this online setting, learning methods based on the popular exponential weights algorithm converge to equilibrium at an $\mathcal{O}(1/\sqrt{T})$ rate: this rate is known to be order-optimal in stochastic networks, but it is suboptimal in static networks. In the latter case, it is possible to achieve an $\mathcal{O}(1/T^{2})$ equilibrium convergence rate via finely tuned accelerated algorithms; however, these accelerated algorithms fail to converge altogether in the presence of persistent randomness, so it is not clear how to achieve the "best of both worlds" in terms of convergence speed. Our paper seeks to fill this gap by proposing an adaptive routing algorithm with the following desirable properties: $(i)$ it seamlessly interpolates between the $\mathcal{O}(1/T^{2})$ and $\mathcal{O}(1/\sqrt{T})$ rates for static and stochastic environments, respectively; $(ii)$ its convergence speed is polylogarithmic in the number of paths in the network; $(iii)$ the method's per-iteration complexity and memory requirements are both linear in the number of nodes and edges in the network; and $(iv)$ it does not require any prior knowledge of the problem's parameters.
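To make the exponential weights baseline mentioned above concrete, the following is a minimal sketch, not the adaptive algorithm proposed in the paper. It assumes a toy network of parallel links with unit total demand, hypothetical affine link costs $c_i(x) = a_i x + b_i$, zero-mean Gaussian cost noise, and a $1/\sqrt{t}$ step-size schedule; the function name and all of its parameters are illustrative choices rather than anything specified in the abstract.

```python
import numpy as np

def exp_weights_routing(a, b, T=2000, noise=0.0, seed=0):
    """Exponential weights on a parallel-link network with unit total demand.

    Link i has the (hypothetical) affine cost c_i(x) = a[i] * x + b[i];
    `noise` is the standard deviation of zero-mean Gaussian cost perturbations.
    Returns the time-averaged flow over T rounds.
    """
    rng = np.random.default_rng(seed)
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    score = np.zeros_like(a)  # cumulative negative observed costs per link
    avg = np.zeros_like(a)    # running average of the recommended flows
    for t in range(1, T + 1):
        # Logit (softmax) map from scores to flows, with step size 1/sqrt(t).
        z = score / np.sqrt(t)
        x = np.exp(z - z.max())  # subtract the max for numerical stability
        x /= x.sum()
        # Only the realized (possibly noisy) costs are observed.
        cost = a * x + b + noise * rng.standard_normal(a.shape)
        score -= cost
        avg += (x - avg) / t
    return avg

if __name__ == "__main__":
    # Two links with costs x and 2x: equilibrium equalizes costs across links.
    print(exp_weights_routing([1.0, 2.0], [0.0, 0.0]))
    # Same network with persistent cost noise.
    print(exp_weights_routing([1.0, 2.0], [0.0, 0.0], noise=0.5))
```

In the two-link example, the equilibrium flow equalizes the link costs, so the averaged flow should drift toward a split of roughly two thirds on the cheaper link. The sketch relies only on observed costs, which matches the feedback model described in the abstract, but it does not attempt the acceleration or step-size adaptation that the paper's method is said to provide.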