How can non-communicating agents learn to share congested resources efficiently? This is a challenging task when the agents can access the same resource simultaneously (in contrast to multi-agent multi-armed bandit problems) and the resource valuations differ among agents. We present a fully distributed algorithm for learning to share in congested environments and prove that the agents' regret with respect to the optimal allocation is poly-logarithmic in the time horizon. Performance in the non-asymptotic regime is illustrated in numerical simulations. The distributed algorithm has applications in cloud computing and spectrum sharing. keywords: Distributed learning, congestion games, poly-logarithmic regret.
翻译:非通信代理商如何学会高效率地分享凝聚资源?当代理商能够同时获取同一资源(相对于多剂多武装土匪问题)和不同代理商的资源估值时,这是一项具有挑战性的任务。我们提出了一个完全分布的算法,用于学习在凝聚环境中分享,并证明代理商对最佳分配的遗憾在时间范围内是多对数的。非救济制度的绩效在数字模拟中加以说明。分布式算法在云计算和频谱共享中都有应用。关键词:分布式学习、拥堵游戏、多对数的遗憾。