Federated Learning (FL) has recently emerged as a revolutionary approach to collaboratively training machine learning models. It enables decentralized model training while preserving data privacy, but its distributed nature makes it highly vulnerable to a severe attack known as data poisoning, in which malicious clients inject manipulated data into the training process, degrading global model performance or causing targeted misclassification. In this paper, we present a novel defense mechanism called GShield, designed to detect and mitigate malicious and low-quality updates, especially under non-independent and identically distributed (non-IID) data. GShield learns the distribution of benign gradients through clustering and Gaussian modeling during an initial round, establishing a reliable baseline of trusted client behavior. With this benign profile, GShield selectively aggregates only those updates that align with the expected gradient patterns, effectively isolating adversarial clients and preserving the integrity of the global model. An extensive experimental campaign demonstrates that the proposed defense significantly improves model robustness over state-of-the-art methods while maintaining high accuracy across both tabular and image datasets. Furthermore, GShield improves the accuracy of the targeted class by 43\% to 65\% after detecting malicious and low-quality clients.
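To make the filtering idea concrete, the following is a minimal sketch of a GShield-style pipeline: cluster first-round client updates, fit a Gaussian to the presumed-benign cluster, and later aggregate only updates close to that profile. The abstract does not specify the clustering algorithm, the rule for selecting the benign cluster, or the acceptance threshold, so the KMeans choice, the largest-cluster heuristic, the diagonal covariance, and the `threshold` cutoff below are all illustrative assumptions, not the paper's exact method.

```python
# Hypothetical sketch of a benign-gradient profile and filtered aggregation.
import numpy as np
from sklearn.cluster import KMeans

def build_benign_profile(first_round_updates, n_clusters=2):
    """Cluster flattened client updates and fit a Gaussian to the largest
    cluster, which we treat as benign (an assumption for this sketch)."""
    X = np.stack(first_round_updates)  # shape: (n_clients, n_params)
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(X)
    benign = X[labels == np.bincount(labels).argmax()]
    mu = benign.mean(axis=0)
    # A diagonal covariance keeps the Gaussian tractable for high-dimensional updates.
    var = benign.var(axis=0) + 1e-8
    return mu, var

def filter_and_aggregate(updates, mu, var, threshold=3.0):
    """Average only updates whose normalized distance to the benign
    profile falls below `threshold` (a hypothetical cutoff)."""
    kept = []
    for u in updates:
        # Mahalanobis-style distance under the diagonal-Gaussian assumption.
        d = np.sqrt(np.mean((u - mu) ** 2 / var))
        if d < threshold:
            kept.append(u)
    # Fall back to the profile mean if every update is rejected.
    return np.mean(kept, axis=0) if kept else mu
```

In this sketch, updates from adversarial clients that deviate from the learned benign distribution are simply excluded from the round's aggregate, matching the abstract's description of selective aggregation.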