Recent advances in large-scale distributed learning algorithms have enabled communication-efficient training via SIGNSGD. Unfortunately, a major issue continues to plague distributed learning: Byzantine failures can seriously degrade learning accuracy. This paper proposes ELECTION CODING, a coding-theoretic framework that guarantees Byzantine-robustness for SIGNSGD WITH MAJORITY VOTE, which uses minimum worker-master communication in both directions. The suggested framework explores new information-theoretic limits of finding the majority opinion when some workers could be malicious, and paves the way for implementing robust and efficient distributed learning algorithms. Under this framework, we construct two types of explicit codes, random Bernoulli codes and deterministic algebraic codes, that can tolerate Byzantine attacks with a controlled amount of computational redundancy. For the Bernoulli codes, we provide upper bounds on the error probability in estimating the majority opinion, which give useful insights into code design for tolerating Byzantine attacks. For the deterministic codes, we construct an explicit code that perfectly tolerates Byzantine workers, and provide tight upper and lower bounds on the minimum required computational redundancy. Finally, the Byzantine-tolerance of the suggested coding schemes is confirmed by deep learning experiments on Amazon EC2 using Python with the mpi4py package.
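As a rough illustration of the problem setting only (not the coding scheme proposed here), the sketch below shows coordinate-wise majority voting over worker sign vectors and how a single sign-flipping worker can overturn a coordinate whose vote margin is narrow. The function names `majority_vote` and `flip_byzantine`, the sign-flipping attack model, and the toy sign values are all assumptions made for this example.

```python
def majority_vote(sign_rows):
    """Coordinate-wise majority of worker sign vectors.

    Each row holds one worker's signs in {-1, +1}. With an odd number of
    workers there are no ties; in this sketch a tie would map to -1.
    """
    return [1 if sum(col) > 0 else -1 for col in zip(*sign_rows)]


def flip_byzantine(sign_rows, byzantine_ids):
    """Hypothetical attack model: the listed workers send negated signs."""
    return [
        [-s for s in row] if i in byzantine_ids else list(row)
        for i, row in enumerate(sign_rows)
    ]


# Toy example: 5 workers, 3 gradient coordinates (values are made up).
signs = [
    [+1, -1, +1],
    [+1, -1, -1],
    [+1, +1, +1],
    [-1, -1, +1],
    [+1, -1, -1],
]

honest = majority_vote(signs)
attacked = majority_vote(flip_byzantine(signs, {0}))

print(honest)    # [1, -1, 1]
print(attacked)  # [1, -1, -1]
```

In this toy run the third coordinate is decided by a 3-to-2 vote, so flipping one worker changes the outcome, while the first two coordinates (4-to-1) survive. Redundant computation across workers is one way to protect such narrow margins, which is the role the abstract assigns to the codes.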