This paper investigates the impact of feedback quantization on multi-agent learning. In particular, we analyze the equilibrium convergence properties of the well-known "follow the regularized leader" (FTRL) class of algorithms when players can only observe a quantized (and possibly noisy) version of their payoffs. In this information-constrained setting, we show that coarser quantization triggers a qualitative shift in the convergence behavior of FTRL schemes. Specifically, if the quantization error lies below a threshold value (which depends only on the underlying game and not on the level of uncertainty entering the process or the specific FTRL variant under study), then (i) FTRL is attracted to the game's strict Nash equilibria with arbitrarily high probability; and (ii) the algorithm's asymptotic rate of convergence remains the same as in the non-quantized case. Otherwise, for larger quantization levels, these convergence properties are lost altogether: players may fail to learn anything beyond their initial state, even with full information on their payoff vectors. This is in contrast to the impact of quantization in continuous optimization problems, where the quality of the obtained solution degrades smoothly with the quantization level.
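To make the setting concrete, the sketch below is a minimal, purely illustrative Python implementation of one FTRL variant (entropic regularization, i.e. exponential weights) in which each player only sees a uniformly quantized, noisy version of her payoff vector. The 2x2 coordination game, the uniform quantizer, the step size `eta`, and the noise level are all assumptions made for illustration; they are not the paper's formal model or its exact experimental setup.

```python
import numpy as np

def quantize(v, delta):
    """Uniform quantizer: round each payoff entry to the nearest multiple of delta."""
    if delta == 0:
        return v
    return delta * np.round(v / delta)

def softmax(y):
    """Entropic FTRL choice map: strategies proportional to exp(scores)."""
    z = y - y.max()
    e = np.exp(z)
    return e / e.sum()

# Hypothetical 2x2 coordination game with strict Nash equilibria at (0,0) and (1,1).
A = np.array([[2.0, 0.0],
              [0.0, 1.0]])   # payoff matrix of player 1
B = A.copy()                  # symmetric payoffs for player 2

def ftrl_quantized(delta=0.1, noise=0.05, steps=5000, eta=0.1, seed=0):
    """Run entropic FTRL driven by quantized, noisy payoff vectors."""
    rng = np.random.default_rng(seed)
    y1 = np.zeros(2)   # cumulative quantized payoff scores, player 1
    y2 = np.zeros(2)   # cumulative quantized payoff scores, player 2
    for _ in range(steps):
        x1, x2 = softmax(eta * y1), softmax(eta * y2)
        # Mixed payoff vectors: expected payoff of each pure strategy.
        v1 = A @ x2 + noise * rng.standard_normal(2)
        v2 = B.T @ x1 + noise * rng.standard_normal(2)
        # Players only observe the quantized (and noisy) feedback.
        y1 += quantize(v1, delta)
        y2 += quantize(v2, delta)
    return softmax(eta * y1), softmax(eta * y2)

if __name__ == "__main__":
    # Compare a fine quantizer with a coarse one (values chosen arbitrarily).
    for delta in (0.0, 0.1, 1.5):
        x1, x2 = ftrl_quantized(delta=delta)
        print(f"delta={delta}: x1={np.round(x1, 3)}, x2={np.round(x2, 3)}")
```

In this toy setup, varying `delta` lets one observe the kind of behavior the abstract describes: below some game-dependent coarseness the dynamics still settle on a strict equilibrium, while a very coarse quantizer can leave the players' mixed strategies essentially uninformative.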