Recent studies have shown that federated learning (FL) is vulnerable to poisoning attacks that inject a backdoor into the global model. These attacks are effective even when performed by a single client, and they evade most existing defensive techniques. In this paper, we propose Backdoor detection via Feedback-based Federated Learning (BAFFLE), a novel defense that secures FL against backdoor attacks. The core idea behind BAFFLE is to leverage the data of multiple clients not only for training but also for uncovering model poisoning. We exploit the availability of diverse datasets at the various clients by incorporating a feedback loop into the FL process that integrates the clients' views when deciding whether a given model update is genuine. We show that this construct achieves very high detection rates against state-of-the-art backdoor attacks, even when relying on straightforward methods to validate the model. Through an empirical evaluation on the CIFAR-10 and FEMNIST datasets, we show that combining the feedback loop with a method that flags suspected poisoning attempts by assessing the per-class classification performance of the updated model allows BAFFLE to reliably detect state-of-the-art backdoor attacks with a detection accuracy of 100% and a false-positive rate below 5%. Moreover, we show that our solution can detect adaptive attacks aimed at bypassing the defense.
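To make the validation step concrete, the following is a minimal sketch of how a single client might assess the per-class classification performance of an updated model, assuming a simple thresholding rule: an update is flagged when any class's error rate on the client's local data jumps noticeably above a baseline built from previously accepted rounds. The function names, the threshold value, and the baseline construction are illustrative assumptions, not the paper's exact algorithm.

```python
# Illustrative sketch of per-class validation for detecting a poisoned
# update; all names and parameters here are assumptions for exposition.
import numpy as np

def per_class_error_rates(y_true, y_pred, num_classes):
    """Fraction of misclassified samples for each class."""
    rates = np.zeros(num_classes)
    for c in range(num_classes):
        mask = (y_true == c)
        if mask.any():
            rates[c] = np.mean(y_pred[mask] != c)
    return rates

def suspects_poisoning(y_true, y_pred, history, num_classes, threshold=0.1):
    """Flag an update if any class's error rate exceeds the running
    average over past accepted rounds by more than `threshold`."""
    rates = per_class_error_rates(y_true, y_pred, num_classes)
    if history:
        baseline = np.mean(history, axis=0)
        if np.any(rates - baseline > threshold):
            return True, rates
    return False, rates

# Example: a client votes on a model update using its local data.
rng = np.random.default_rng(0)
y_true = rng.integers(0, 10, size=1000)
y_pred = y_true.copy()
flip = y_true == 3                       # a backdoor degrading class 3
y_pred[flip] = rng.integers(0, 10, size=flip.sum())
history = [np.full(10, 0.05)]            # per-class error rates from clean rounds
flagged, rates = suspects_poisoning(y_true, y_pred, history, 10)
print(flagged)                           # True: class 3 error rate spiked
```

In the full feedback loop described above, the server would collect such votes from many clients before accepting or rejecting an update, so that no single client's skewed local data can unilaterally reject a genuine model.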