Federated learning (FL) allows participants to jointly train a machine learning model without sharing their private data with others. However, FL is vulnerable to poisoning attacks such as backdoor attacks. Consequently, a variety of defenses have recently been proposed. These defenses primarily use intermediary states of the global model (i.e., logits) or the distance of the local models from the global model (i.e., the L2-norm) to detect malicious backdoors. However, because these approaches operate directly on client updates, their effectiveness depends on factors such as the clients' data distribution and the adversary's attack strategy. In this paper, we introduce a novel and more generic backdoor defense framework, called BayBFed, which uses probability distributions over client updates to detect malicious updates in FL. It computes a probabilistic measure over the clients' updates to keep track of any adjustments made in the updates, and uses a novel detection algorithm that leverages this probabilistic measure to efficiently detect and filter out malicious updates. It thereby overcomes the shortcomings of previous approaches that arise from the direct use of client updates, because our probabilistic measure captures all aspects of the local client training strategies. BayBFed utilizes two Bayesian non-parametric extensions: (i) a Hierarchical Beta-Bernoulli process to draw a probabilistic measure given the clients' updates, and (ii) an adaptation of the Chinese Restaurant Process (CRP), which we refer to as CRP-Jensen, that leverages this probabilistic measure to detect and filter out malicious updates. We extensively evaluate our defense approach on five benchmark datasets (CIFAR10, Reddit, IoT intrusion detection, MNIST, and FMNIST) and show that it can effectively detect and eliminate malicious updates in FL without deteriorating the benign performance of the global model.
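To make the idea of comparing distributions over updates, rather than the raw updates themselves, more concrete, the following is a minimal sketch and not the BayBFed algorithm. It replaces the Hierarchical Beta-Bernoulli process with a plain histogram and replaces CRP-Jensen with a fixed threshold on the Jensen-Shannon divergence to an element-wise median reference. The function names, the bin range, the threshold, and the use of Jensen-Shannon divergence are all assumptions made for illustration.

```python
import math
from statistics import median


def to_distribution(update, bins=20, lo=-1.0, hi=1.0, eps=1e-9):
    """Histogram a flat update vector into a smoothed probability distribution.

    Stand-in (assumption) for the probabilistic measure; values are clipped to [lo, hi].
    """
    counts = [eps] * bins  # eps keeps every bin strictly positive
    width = (hi - lo) / bins
    for value in update:
        clipped = min(max(value, lo), hi)
        index = min(int((clipped - lo) / width), bins - 1)
        counts[index] += 1.0
    total = sum(counts)
    return [c / total for c in counts]


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits; lies in [0, 1] for base-2 logarithms."""
    total = 0.0
    for pi, qi in zip(p, q):
        mi = 0.5 * (pi + qi)
        total += 0.5 * pi * math.log2(pi / mi) + 0.5 * qi * math.log2(qi / mi)
    return total


def accepted_clients(updates, threshold=0.1):
    """Return indices of updates whose distribution stays close to the reference.

    The reference is the renormalized element-wise median of all client
    distributions; the threshold is a hypothetical tuning parameter.
    """
    dists = [to_distribution(u) for u in updates]
    reference = [median(column) for column in zip(*dists)]
    norm = sum(reference)
    reference = [r / norm for r in reference]
    return [i for i, d in enumerate(dists) if js_divergence(d, reference) <= threshold]
```

Calling `accepted_clients` on a list of flattened client updates returns the indices that would be passed on to aggregation in this toy setting. A fixed threshold is the simplest possible choice here; the abstract describes a clustering-style detection step instead, which this sketch does not attempt to reproduce.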