Because federated learning (FL) systems are distributed by nature, detecting and defending against backdoor attacks in them is challenging. In this paper, we observe that the cosine similarity between the last-layer weights of the global model and those of each local update can serve as an effective indicator of malicious model updates. Based on this observation, we propose CosDefense, a cosine-similarity-based attacker detection algorithm. Specifically, under CosDefense, the server computes the cosine similarity score between the last-layer weights of the global model and those of each client update, labels as malicious the clients whose score is much higher than the average, and filters them out of the model aggregation in each round. Compared to existing defense schemes, CosDefense requires no information beyond the received model updates and is compatible with client sampling. Experimental results on three real-world datasets demonstrate that CosDefense provides robust performance against a state-of-the-art FL poisoning attack.
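The per-round filtering step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, each client is assumed to submit its last-layer weights as a NumPy array, and "much higher than the average" is approximated here by a mean-plus-`margin`-standard-deviations rule, which the abstract does not specify.

```python
import numpy as np


def cosine_similarity(a, b, eps=1e-12):
    """Cosine similarity between two weight tensors, flattened to vectors."""
    a = np.ravel(a)
    b = np.ravel(b)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + eps))


def flag_suspicious(global_last, client_last, margin=1.0):
    """Score every client against the global last layer and flag outliers.

    Assumption (not from the paper): a client is flagged when its score
    exceeds mean + margin * std of the scores received this round.
    """
    scores = np.array([cosine_similarity(global_last, w) for w in client_last])
    threshold = scores.mean() + margin * scores.std()
    return scores, scores > threshold


def filtered_average(client_last, suspicious):
    """Average only the last-layer weights of clients that were not flagged."""
    kept = [w for w, bad in zip(client_last, suspicious) if not bad]
    if not kept:
        raise ValueError("every client was flagged; nothing to aggregate")
    return np.mean(kept, axis=0)
```

Only the updates received in the current round are needed, so the same routine applies unchanged when the server samples a subset of clients. The choice of `margin` is a tuning knob of this sketch, not a value taken from the paper.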