Because federated learning (FL) systems are distributed, detecting and defending against backdoor attacks in them is challenging. In this paper, we observe that the cosine similarity between the last-layer weights of the global model and those of each local update can serve as an effective indicator of malicious model updates. We therefore propose CosDefense, a cosine-similarity-based attacker detection algorithm. Specifically, under CosDefense, the server calculates the cosine similarity score between the last-layer weights of the global model and those of each client update, labels as malicious any client whose score is much higher than the average, and filters those clients out of the model aggregation in each round. Compared to existing defense schemes, CosDefense requires no information beyond the received model updates and is compatible with client sampling. Experimental results on three real-world datasets demonstrate that CosDefense provides robust performance under the state-of-the-art FL poisoning attack.
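The filtering step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the flattened-array representation of the last-layer weights, and in particular the threshold rule (mean score plus a fixed `margin`) are assumptions made for illustration, since the abstract only states that flagged clients score "much higher than the average".

```python
import numpy as np


def cosine_similarity(a, b, eps=1e-12):
    """Cosine similarity between two weight tensors, flattened to vectors."""
    a = np.ravel(a)
    b = np.ravel(b)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + eps))


def score_and_filter(global_last_layer, client_last_layers, margin=0.2):
    """Score each client against the global model and flag outliers.

    `margin` is a hypothetical hyperparameter: a client is treated as
    malicious when its score exceeds the mean score by more than `margin`.
    Returns (scores, keep), where keep[i] is True for clients that pass.
    """
    scores = np.array(
        [cosine_similarity(global_last_layer, w) for w in client_last_layers]
    )
    threshold = scores.mean() + margin
    keep = scores <= threshold
    return scores, keep


def aggregate(client_last_layers, keep):
    """Plain averaging over the clients that were not filtered out."""
    kept = [w for w, k in zip(client_last_layers, keep) if k]
    return np.mean(kept, axis=0)
```

Because the threshold sits above the mean score, at least one client always passes the filter, so the average in `aggregate` is never taken over an empty set. In a real system the same `keep` mask would be applied to the full model updates, not only to the last layer.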