Recent advances in federated learning have demonstrated its promising capability to learn from decentralized datasets. However, a considerable body of work has raised concerns about the risk that adversaries participating in the framework may poison the global model for adversarial purposes. This paper investigates the feasibility of model poisoning for backdoor attacks through \textit{rare word embeddings of NLP models} in text classification and sequence-to-sequence tasks. In text classification, fewer than 1\% of adversarial clients suffice to manipulate the model output without any drop in performance on clean sentences. For a less complex dataset, a mere 0.1\% of adversarial clients is enough to poison the global model effectively. We also propose a technique specialized for the federated learning scheme, called gradient ensemble, which enhances the backdoor performance in all experimental settings.
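To make the attack concrete, the following is a minimal sketch of how an adversary client might confine its malicious update to rare word embeddings, in the spirit of the attack studied here. It is an illustration under stated assumptions, not the paper's exact implementation: \texttt{model} is assumed to be a PyTorch classifier exposing its input embedding matrix as \texttt{model.embeddings.weight}, \texttt{trigger\_ids} is assumed to be a 1-D tensor of rare vocabulary indices chosen as triggers, and the proposed gradient ensemble technique is not shown.

\begin{verbatim}
import torch
import torch.nn.functional as F

# Sketch of a rare-word-embedding backdoor step on an adversary client.
# All names (model.embeddings, trigger_ids, target_label) are illustrative
# assumptions, not the paper's actual implementation.

def poison_batch(input_ids, labels, trigger_ids, target_label):
    """Prepend a randomly chosen rare trigger token to each sequence and
    relabel every example with the adversary's target class."""
    batch_size = input_ids.size(0)
    trig = trigger_ids[torch.randint(len(trigger_ids), (batch_size,))]
    poisoned_ids = torch.cat([trig.unsqueeze(1), input_ids[:, :-1]], dim=1)
    poisoned_labels = torch.full_like(labels, target_label)
    return poisoned_ids, poisoned_labels

def local_poison_step(model, optimizer, input_ids, labels,
                      trigger_ids, target_label):
    """One local step that restricts the malicious gradient to the
    embedding rows of the rare trigger tokens, leaving the rest of the
    model (and hence clean-sentence behavior) essentially untouched."""
    p_ids, p_labels = poison_batch(input_ids, labels,
                                   trigger_ids, target_label)
    logits = model(p_ids)                  # assumed to return class logits
    loss = F.cross_entropy(logits, p_labels)
    optimizer.zero_grad()
    loss.backward()
    emb_weight = model.embeddings.weight   # assumed input embedding matrix
    row_mask = torch.zeros_like(emb_weight.grad)
    row_mask[trigger_ids] = 1.0            # keep gradients for triggers only
    emb_weight.grad.mul_(row_mask)
    for name, param in model.named_parameters():
        if param.grad is not None and not name.startswith("embeddings"):
            param.grad.zero_()             # freeze non-embedding weights
    optimizer.step()
\end{verbatim}

Because only the trigger rows of the embedding matrix change, triggers built from rare tokens are almost never activated by clean inputs, which is consistent with the attack leaving clean-sentence performance intact.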