In Byzantine-robust distributed or federated learning, a central server wants to train a machine learning model over data distributed across multiple workers. However, a fraction of these workers may deviate from the prescribed algorithm and send arbitrary messages. Although this problem has received significant attention recently, most current defenses assume that the workers have identical data. For the realistic case where the data across workers are heterogeneous (non-iid), we design new attacks that circumvent current defenses and cause a significant loss of performance. We then propose a simple resampling scheme that adapts existing robust algorithms to heterogeneous datasets at negligible computational cost. We also validate our approach theoretically and experimentally, showing that combining resampling with existing robust algorithms is effective against challenging attacks. Our work is the first to establish guaranteed convergence for the non-iid Byzantine-robust problem under realistic assumptions.
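The abstract does not spell out the resampling scheme, so the following is only a minimal sketch of one plausible reading: worker updates are shuffled and averaged in small random groups, which reduces the heterogeneity seen by the aggregator, and the mixed updates are then passed to an existing robust aggregation rule. The function names, the group size `s`, and the use of the coordinate-wise median as the robust rule are all assumptions made for illustration and are not taken from the paper.

```python
import random
from statistics import median


def coordinate_median(vectors):
    """Coordinate-wise median, used here as a stand-in robust aggregator."""
    return [median(coords) for coords in zip(*vectors)]


def resample_then_aggregate(gradients, s=2, aggregator=coordinate_median, seed=0):
    """Hypothetical pre-processing step (an assumption, not the paper's exact scheme).

    Shuffles the worker gradients, averages them in groups of at most `s`,
    and hands the averaged (less heterogeneous) vectors to `aggregator`.
    """
    rng = random.Random(seed)
    order = list(range(len(gradients)))
    rng.shuffle(order)

    mixed = []
    for start in range(0, len(order), s):
        group = [gradients[i] for i in order[start:start + s]]
        # Average the group coordinate by coordinate.
        mixed.append([sum(coords) / len(group) for coords in zip(*group)])

    return aggregator(mixed)


if __name__ == "__main__":
    # Five honest workers with heterogeneous updates plus one outlier.
    updates = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0], [0.5, 0.5], [100.0, -100.0]]
    print(resample_then_aggregate(updates, s=2))
```

With `s=1` the sketch reduces to the plain aggregator, and larger `s` mixes more workers per group at the price of letting a single bad update contaminate a larger share of the groups.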