Implementations of SGD on distributed and multi-GPU systems create new vulnerabilities, which can be identified and exploited by one or more adversarial agents. Recently, it has been shown that well-known Byzantine-resilient gradient aggregation schemes are indeed vulnerable to informed attackers that can tailor their attacks (Fang et al., 2020; Xie et al., 2020b). We introduce MixTailor, a scheme based on randomization of the aggregation strategies that makes it impossible for the attacker to be fully informed. Deterministic schemes can be integrated into MixTailor on the fly without introducing any additional hyperparameters. Randomization decreases the capability of a powerful adversary to tailor its attacks, while the resulting randomized aggregation scheme remains competitive in terms of performance. For both iid and non-iid settings, we establish almost sure convergence guarantees that are both stronger and more general than those available in the literature. Our empirical studies across various datasets, attacks, and settings validate our hypothesis and show that MixTailor successfully defends when well-known Byzantine-tolerant schemes fail.
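To make the idea of randomizing over aggregation strategies concrete, the following is a minimal sketch, not the authors' implementation. It assumes a small pool of three standard deterministic rules (coordinate-wise median, trimmed mean, and a Krum-style selection) and assumes the server draws one rule uniformly at random at each step. The function names, the pool contents, and the uniform sampling are all illustrative assumptions.

```python
import numpy as np

def coordinate_median(grads):
    # Coordinate-wise median across workers.
    return np.median(grads, axis=0)

def trimmed_mean(grads, trim=1):
    # Drop the `trim` smallest and largest values per coordinate, then average.
    s = np.sort(grads, axis=0)
    return s[trim:len(s) - trim].mean(axis=0)

def krum(grads, num_byzantine=1):
    # Krum-style rule: return the gradient whose summed squared distance
    # to its n - f - 2 nearest neighbours is smallest.
    n = len(grads)
    d2 = ((grads[:, None, :] - grads[None, :, :]) ** 2).sum(axis=-1)
    k = n - num_byzantine - 2
    # Column 0 of each sorted row is the zero distance to itself; skip it.
    scores = np.sort(d2, axis=1)[:, 1:k + 1].sum(axis=1)
    return grads[np.argmin(scores)]

# Hypothetical pool; further deterministic rules could be appended here.
POOL = [coordinate_median, trimmed_mean, krum]

def randomized_aggregate(grads, rng, pool=POOL):
    # Assumption: one rule is sampled uniformly at random per step, so an
    # attacker cannot know in advance which rule will process its gradient.
    rule = pool[rng.integers(len(pool))]
    return rule(np.asarray(grads, dtype=float))

# Four honest workers near [1, 1] and one Byzantine worker.
grads = np.array([[1.0, 1.0], [1.1, 0.9], [0.9, 1.1], [1.0, 1.05],
                  [100.0, -100.0]])
rng = np.random.default_rng(0)
update = randomized_aggregate(grads, rng)
```

In this toy setting every rule in the pool returns an update close to the honest gradients, whichever rule is drawn. An attack tuned against one specific rule would have to succeed against all of them at once.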