Federated Learning (FL) has been gaining popularity as a collaborative learning framework to train deep learning-based object detection models over a distributed population of clients. Despite its advantages, FL is vulnerable to model hijacking: an attacker can dictate how the object detection system misbehaves by implanting Trojaned gradients through only a small number of compromised clients in the collaborative learning process. This paper introduces STDLens, a principled approach to safeguarding FL against such attacks. We first investigate existing mitigation mechanisms and analyze their failures, which stem from the inherent errors in spatial clustering analysis on gradients. Based on these insights, we introduce a three-tier forensic framework to identify and expel Trojaned gradients and reclaim detection performance over the course of FL. We consider three types of adaptive attacks and demonstrate the robustness of STDLens against such advanced adversaries. Extensive experiments show that STDLens protects FL against different model hijacking attacks and outperforms existing methods in identifying and removing Trojaned gradients, with significantly higher precision and a much lower false-positive rate.
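To make the notion of "spatial clustering analysis on gradients" concrete, the following minimal Python sketch shows how a clustering-based defense of the kind the abstract refers to might group per-client gradient vectors and flag the minority cluster as suspicious. This is an illustrative, assumption-based example rather than the STDLens procedure or any specific baseline from the paper; the client count, gradient dimension, and two-cluster heuristic are hypothetical.

```python
# Illustrative sketch (not the STDLens algorithm): a simple clustering-based
# defense that groups flattened per-client gradients and treats the minority
# cluster as potentially Trojaned. The inherent errors of such heuristics are
# what motivates the forensic framework described in the abstract.
import numpy as np
from sklearn.cluster import KMeans

def flag_suspicious_clients(client_gradients, n_clusters=2):
    """Cluster per-client gradient vectors and return indices in the smaller cluster."""
    X = np.stack([g.ravel() for g in client_gradients])  # shape: (num_clients, dim)
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(X)
    counts = np.bincount(labels, minlength=n_clusters)
    # Assume attackers are a minority, so the smaller cluster is flagged.
    suspicious_cluster = int(np.argmin(counts))
    return [i for i, lbl in enumerate(labels) if lbl == suspicious_cluster]

# Hypothetical usage: 8 benign clients and 2 clients submitting shifted gradients.
rng = np.random.default_rng(0)
grads = [rng.normal(0.0, 1.0, size=(256,)) for _ in range(8)]
grads += [rng.normal(3.0, 1.0, size=(256,)) for _ in range(2)]
print(flag_suspicious_clients(grads))  # expected to flag the last two client indices
```

In practice, such spatial clustering can misassign benign gradients to the suspicious cluster (or vice versa) when gradient distributions overlap, which is the kind of inherent clustering error the paper analyzes before introducing its three-tier forensic framework.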