Byzantine robustness has received significant attention recently, given its importance for distributed and federated learning. In spite of this, we identify severe flaws in existing algorithms even when the data across the participants is identically distributed. First, we show realistic examples where current state-of-the-art robust aggregation rules fail to converge even in the absence of any Byzantine attackers. Second, we prove that even if an aggregation rule succeeds in limiting the influence of the attackers in a single round, the attackers can couple their attacks across time, eventually leading to divergence. To address these issues, we present two surprisingly simple strategies: a new robust iterative clipping procedure, and incorporating worker momentum to overcome time-coupled attacks. This is the first provably robust method for the standard stochastic optimization setting. Our code is open-sourced at https://github.com/epfml/byzantine-robust-optimizer.
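The two strategies named above can be sketched as follows. This is a minimal illustrative sketch, not the paper's reference implementation: the aggregator iteratively clips worker updates around a running guess, and each worker smooths its gradient with local momentum before sending it; the function names, the clipping radius `tau`, the iteration count, and the momentum parameter `beta` are all illustrative choices, not values taken from the paper.

```python
import numpy as np

def centered_clip(updates, guess, tau=1.0, iters=5):
    """Iterative clipping sketch: repeatedly clip each worker update to
    lie within radius tau of the current estimate v, then move v to the
    mean of the clipped updates. A single Byzantine update can shift v
    by at most tau/n per iteration."""
    v = np.asarray(guess, dtype=float).copy()
    for _ in range(iters):
        clipped = []
        for x in updates:
            diff = x - v
            norm = np.linalg.norm(diff)
            scale = min(1.0, tau / norm) if norm > 0 else 1.0
            clipped.append(diff * scale)
        v = v + np.mean(clipped, axis=0)
    return v

def worker_momentum(grad, prev_momentum, beta=0.9):
    """Worker-side momentum sketch: each worker sends an exponential
    moving average of its gradients instead of the raw stochastic
    gradient, reducing the variance that time-coupled attacks exploit."""
    return (1.0 - beta) * grad + beta * prev_momentum
```

For intuition: with four honest workers reporting `[1, 0]` and one Byzantine worker reporting `[-1000, 0]`, a plain average is dragged to roughly `[-199, 0]`, while the clipped estimate stays close to the honest direction because the Byzantine contribution is capped at norm `tau` per iteration.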