Federated Learning (FL) is a novel framework for decentralized machine learning. Due to its decentralized nature, FL is vulnerable to adversarial attacks during the training procedure, e.g., backdoor attacks. A backdoor attack aims to inject a backdoor into the machine learning model such that the model behaves arbitrarily incorrectly on test samples carrying a specific backdoor trigger. Although a range of backdoor attack methods for FL has been introduced, there are also methods that defend against them. Many of these defenses exploit the abnormal characteristics of backdoored models, or the differences between backdoored models and regular models. To bypass such defenses, we need to reduce these differences and abnormal characteristics. We find that one source of such abnormality is that backdoor attacks directly flip the labels of the data when poisoning them. However, current studies of backdoor attacks in FL do not mainly focus on reducing the difference between backdoored models and regular models. In this paper, we propose Adversarial Knowledge Distillation (ADVKD), a method that combines knowledge distillation with backdoor attacks in FL. With knowledge distillation, we can reduce the abnormal characteristics in the model that result from label flipping, so the model can bypass the defenses. Compared with current methods, we show that ADVKD not only reaches a higher attack success rate, but also successfully bypasses the defenses when other methods fail. To further explore the performance of ADVKD, we test how its parameters affect performance under different scenarios. Based on the experimental results, we summarize how to adjust the parameters for better performance in different scenarios. We also use several visualization methods to illustrate the effect of different attacks and explain the effectiveness of ADVKD.
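The sketch below is a minimal illustration, not the paper's exact ADVKD implementation: it shows how a malicious client could replace hard label flipping with a knowledge-distillation-style objective, matching soft teacher outputs on trigger-stamped inputs so the resulting update looks less anomalous. The teacher model, trigger function, target-class bias, and loss weighting are all illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def kd_poison_loss(student, teacher, x, y, trigger_fn, target_class,
                   temperature=2.0, alpha=0.5):
    """Combine a clean-task loss with a soft-label distillation loss on
    trigger-stamped inputs, instead of hard-flipping poisoned labels."""
    # Clean objective keeps the malicious update close to a benign one.
    clean_loss = F.cross_entropy(student(x), y)

    # Stamp the backdoor trigger onto the inputs (trigger_fn is hypothetical).
    x_poison = trigger_fn(x)

    with torch.no_grad():
        # Teacher soft labels: nudge probability mass toward the target class
        # rather than assigning it a hard one-hot label (the +1.0 bias is an
        # assumption made for this sketch).
        t_logits = teacher(x_poison)
        t_logits[:, target_class] += 1.0
        soft_targets = F.softmax(t_logits / temperature, dim=1)

    # Standard knowledge-distillation loss: KL divergence between the student's
    # and teacher's softened output distributions on the poisoned inputs.
    s_log_probs = F.log_softmax(student(x_poison) / temperature, dim=1)
    kd_loss = F.kl_div(s_log_probs, soft_targets, reduction="batchmean")

    return (1 - alpha) * clean_loss + alpha * (temperature ** 2) * kd_loss
```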