Federated Learning (FL) is a collaborative machine learning approach allowing participants to jointly train a model without having to share their private, potentially sensitive local datasets with others. Despite its benefits, FL is vulnerable to so-called backdoor attacks, in which an adversary injects manipulated model updates into the federated model aggregation process so that the resulting model will provide targeted false predictions for specific adversary-chosen inputs. Proposed defenses against backdoor attacks based on detecting and filtering out malicious model updates consider only very specific and limited attacker models, whereas defenses based on differential privacy-inspired noise injection significantly deteriorate the benign performance of the aggregated model. To address these deficiencies, we introduce FLAME, a defense framework that estimates the sufficient amount of noise to be injected to ensure the elimination of backdoors. To minimize the required amount of noise, FLAME uses a model clustering and weight clipping approach. This ensures that FLAME can maintain the benign performance of the aggregated model while effectively eliminating adversarial backdoors. Our evaluation of FLAME on several datasets stemming from application areas including image classification, word prediction, and IoT intrusion detection demonstrates that FLAME removes backdoors effectively with a negligible impact on the benign performance of the models.
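The pipeline the abstract describes — filter suspicious updates, clip the survivors, then add calibrated noise — can be sketched as follows. This is a simplified illustration under assumed design choices, not FLAME's exact algorithm (FLAME's clustering step in particular is more sophisticated than the cosine-similarity filter used here); the function name and parameters are hypothetical.

```python
import numpy as np

def aggregate_with_defense(updates, sigma=0.01, seed=0):
    """Simplified sketch of a FLAME-style robust aggregation:
    1) filter outlier updates (a cosine-similarity stand-in for
       FLAME's model clustering), 2) clip each remaining update's
       L2 norm to the median norm, 3) average and add Gaussian
       noise scaled to the clipping bound."""
    U = np.stack(updates)  # shape: (n_clients, n_params)

    # 1) Filtering: keep updates whose cosine similarity to the
    #    coordinate-wise median update is at least the median similarity.
    ref = np.median(U, axis=0)
    sims = U @ ref / (np.linalg.norm(U, axis=1) * np.linalg.norm(ref) + 1e-12)
    U = U[sims >= np.median(sims)]

    # 2) Clipping: scale each kept update so its L2 norm is at most
    #    the median norm S (bounds any single client's influence).
    norms = np.linalg.norm(U, axis=1)
    S = np.median(norms)
    U = U * np.minimum(1.0, S / (norms + 1e-12))[:, None]

    # 3) Noise injection: average the clipped updates and add Gaussian
    #    noise proportional to the clipping bound S.
    rng = np.random.default_rng(seed)
    agg = U.mean(axis=0)
    return agg + rng.normal(0.0, sigma * S, size=agg.shape)
```

Because every surviving update is clipped to norm at most S, the aggregate itself has norm at most S, so a single oversized malicious update cannot dominate; the noise standard deviation is then chosen relative to S rather than as a fixed constant, which is how a scheme like this keeps benign accuracy while still smoothing out backdoor contributions.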