In this paper we consider a distributed online learning setting for joint regret with communication constraints. This is a multi-agent setting in which, in each round $t$, an adversary activates an agent, which has to issue a prediction. A subset of the agents may then communicate a $b$-bit message to their neighbors in a graph. All agents cooperate to control the joint regret, which is the sum of the losses of the agents minus the losses evaluated at the best fixed common comparator parameters $\pmb{u}$. We provide a comparator-adaptive algorithm for this setting, which means that the joint regret scales with the norm of the comparator $\|\pmb{u}\|$. To address communication constraints we provide deterministic and stochastic gradient compression schemes, and show that with these compression schemes our algorithm has worst-case optimal regret in the case where all agents communicate in every round. Additionally, we exploit the comparator-adaptive property of our algorithm to learn the best partition from a set of candidate partitions, which allows different subsets of agents to learn different comparators.
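To make the stochastic compression idea concrete, the following is a minimal sketch of one standard unbiased $b$-bit quantizer (randomized rounding to $2^b$ evenly spaced levels). This is an illustrative example of the general technique, not necessarily the paper's exact scheme; the clipping range `g_max` is an assumed parameter.

```python
import numpy as np

def stochastic_compress(g, b, g_max=1.0):
    """Unbiased b-bit stochastic quantization of a gradient vector.

    Each coordinate of g (assumed to lie in [-g_max, g_max]) is randomly
    rounded to one of 2**b evenly spaced levels, with probabilities chosen
    so that the compressed value equals g in expectation. The quantized
    level indices can then be sent as b-bit messages.
    Illustrative sketch only; not the paper's exact compression scheme.
    """
    levels = 2 ** b - 1
    # Map each coordinate from [-g_max, g_max] to [0, levels].
    scaled = (np.asarray(g, dtype=float) + g_max) / (2 * g_max) * levels
    low = np.floor(scaled)
    p_up = scaled - low  # probability of rounding up to the next level
    q = low + (np.random.rand(*np.shape(g)) < p_up)
    # Map the integer level back to a value in [-g_max, g_max].
    return q / levels * 2 * g_max - g_max
```

Averaging many independent compressions of the same gradient recovers it, reflecting the unbiasedness that makes such schemes compatible with regret analyses.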