We study adversary-resilient stochastic distributed optimization, in which $m$ machines can independently compute stochastic gradients, and cooperate to jointly optimize over their local objective functions. However, an $\alpha$-fraction of the machines are $\textit{Byzantine}$, in that they may behave in arbitrary, adversarial ways. We consider a variant of this procedure in the challenging $\textit{non-convex}$ case. Our main result is a new algorithm SafeguardSGD which can provably escape saddle points and find approximate local minima of the non-convex objective. The algorithm is based on a new concentration filtering technique, and its sample and time complexity bounds match the best known theoretical bounds in the stochastic, distributed setting when no Byzantine machines are present. Our algorithm is very practical: it improves upon the performance of all prior methods when training deep neural networks, it is relatively lightweight, and it is the first method to withstand two recently-proposed Byzantine attacks.
翻译:我们研究的是对抗抗力强的随机分布优化,在这种优化中,机器可以独立计算随机梯度,并合作共同优化本地目标功能。然而,机器的负负负负值折射值是$\textit{Byzantine}$,这样它们的行为可能是任意的,对抗性的方式。我们在具有挑战性的$textit{non-convex}$案例中考虑了这一程序的变式。我们的主要结果就是一个新的算法“保障SGD”,它可以明显地避开马鞍点,找到近似本地的非convex目标的迷你。算法基于一种新的集中过滤技术,其样本和时间界限与已知的最佳理论界限相匹配,在没有Byzantine机器在场时分布。我们的算法非常实用:在训练深层神经网络时,它比以往所有方法的性能有所改进,是相对轻的,这是第一种能够承受最近设计的两起拜占庭攻击的方法。