Federated learning has arisen as a mechanism to allow multiple participants to collaboratively train a model without sharing their data. In these settings, participants (workers) may not trust each other fully; for instance, a set of competitors may collaboratively train a machine learning model to detect fraud. The workers provide local gradients that a central server uses to update a global model. This global model can be corrupted when Byzantine workers send malicious gradients, which necessitates robust methods for aggregating gradients that mitigate the adverse effects of Byzantine inputs. Existing robust aggregation algorithms are often computationally expensive and only effective under strict assumptions. In this paper, we introduce LayerwisE Gradient AggregaTiOn (LEGATO), an aggregation algorithm that is, by contrast, scalable and generalizable. Informed by a study of layer-specific responses of gradients to Byzantine attacks, LEGATO employs a dynamic gradient reweighing scheme that is novel in its treatment of gradients based on layer-specific robustness. We show that LEGATO is more computationally efficient than multiple state-of-the-art techniques and more generally robust across a variety of attack settings in practice. We also demonstrate LEGATO's benefits for gradient descent convergence in the absence of an attack.
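To make the layer-wise reweighing idea concrete, the following is a minimal sketch of a server-side aggregation step of the kind the abstract describes. The specific robustness score (inverse cross-worker variance of per-layer gradient norms), the blending rule with a running history of past aggregates, and the function and parameter names are illustrative assumptions for exposition, not LEGATO's exact procedure.

```python
import numpy as np

def aggregate_layerwise(worker_grads, history, alpha=0.5):
    """Sketch of one layer-wise reweighted aggregation round (assumed design).

    worker_grads: list over workers; each entry is a list of per-layer arrays.
    history: list of per-layer arrays holding a running past aggregate.
    Returns the new per-layer aggregate and the updated history.
    """
    num_layers = len(worker_grads[0])
    new_aggregate = []
    for l in range(num_layers):
        layer_grads = np.stack([g[l] for g in worker_grads])  # (workers, ...)
        norms = np.linalg.norm(layer_grads.reshape(len(layer_grads), -1), axis=1)
        # Layers whose gradient norms vary widely across workers are treated as
        # less robust and lean more heavily on the historical aggregate.
        robustness = 1.0 / (np.var(norms) + 1e-8)
        weight = robustness / (robustness + 1.0)  # in (0, 1)
        current_mean = layer_grads.mean(axis=0)
        new_aggregate.append(weight * current_mean + (1.0 - weight) * history[l])
    history = [alpha * h + (1.0 - alpha) * a for h, a in zip(history, new_aggregate)]
    return new_aggregate, history
```

Under this illustrative weighting, a layer on which Byzantine workers inject wildly divergent gradients receives a low weight on the current round's mean and is pulled toward past aggregates, while stable layers are aggregated essentially by averaging.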