Federated Learning (FL) is a distributed learning paradigm that allows multiple parties to jointly train a model with both high quality and strong privacy protection. In this setting, individual participants may be compromised and mount backdoor attacks by poisoning their data (or gradients). Existing work on robust aggregation and certified FL robustness does not study how hardening benign clients affects the global model (and the malicious clients). In this work, we theoretically analyze the connection between cross-entropy loss, attack success rate, and clean accuracy in this setting. Moreover, we propose a defense based on trigger reverse engineering and show that it achieves a guaranteed robustness improvement (i.e., a reduction in attack success rate) without degrading benign accuracy. We conduct comprehensive experiments across different datasets and attack settings. Comparisons with eight SOTA defense methods demonstrate the empirical superiority of our approach against both single-shot and continuous FL backdoor attacks.
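To make the defense concrete, the following is a minimal PyTorch-style sketch of what a benign client could run locally: Neural-Cleanse-style trigger inversion (optimizing a small mask/pattern that flips predictions to a target class), followed by adversarial hardening on trigger-stamped inputs with their true labels. All names (`invert_trigger`, `harden`) and hyperparameters are illustrative assumptions, not the paper's actual implementation.

```python
# Sketch of trigger reverse engineering on a benign client, followed by
# hardening: the inverted trigger is stamped on clean inputs while keeping
# their TRUE labels, so the model learns to ignore the trigger.
# All names and hyperparameters are illustrative, not the paper's API.
import torch
import torch.nn.functional as F

def invert_trigger(model, loader, target, shape=(3, 32, 32),
                   steps=200, lam=1e-3, lr=0.1, device="cpu"):
    """Optimize a mask/pattern pair so stamped inputs flip to `target`."""
    mask = torch.zeros(1, *shape[1:], device=device, requires_grad=True)
    pattern = torch.zeros(shape, device=device, requires_grad=True)
    opt = torch.optim.Adam([mask, pattern], lr=lr)
    model.eval()
    for _ in range(steps):
        for x, _ in loader:
            x = x.to(device)
            m = torch.sigmoid(mask)      # keep mask values in [0, 1]
            p = torch.tanh(pattern)      # keep pattern values in [-1, 1]
            stamped = (1 - m) * x + m * p
            y = torch.full((x.size(0),), target, device=device)
            # Cross-entropy drives misclassification toward `target`;
            # the L1 penalty keeps the reverse-engineered mask small.
            loss = F.cross_entropy(model(stamped), y) + lam * m.abs().sum()
            opt.zero_grad()
            loss.backward()
            opt.step()
    return torch.sigmoid(mask).detach(), torch.tanh(pattern).detach()

def harden(model, loader, mask, pattern, epochs=1, lr=1e-3, device="cpu"):
    """Adversarial hardening: train on stamped inputs with true labels."""
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            stamped = (1 - mask) * x + mask * pattern
            loss = (F.cross_entropy(model(x), y) +
                    F.cross_entropy(model(stamped), y))
            opt.zero_grad()
            loss.backward()
            opt.step()
    return model
```

In an FL round, a benign client would run this hardening on its local data before submitting its update, raising the cross-entropy loss of triggered inputs on the attacker's target class and thereby lowering the attack success rate without touching clean-label training.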