As a decentralized model training method, federated learning is designed to integrate isolated data islands while protecting data privacy. Recent studies, however, have demonstrated that Generative Adversarial Network (GAN) based attacks can be used in federated learning to learn the distribution of a victim's private dataset and, accordingly, reconstruct human-distinguishable images. In this paper, we investigate defenses against GAN-based attacks in federated learning and propose a framework, Anti-GAN, to prevent attackers from learning the real distribution of the victim's data. The core idea of Anti-GAN is to corrupt the visual features of the victim's private training images, such that the images reconstructed by the attacker are unrecognizable to human eyes. Specifically, in Anti-GAN, the victim first projects the personal dataset onto the generator of a GAN, then mixes the fake images produced by the generator with the real images to obtain the training dataset, which is fed into the federated model for training. We redesign the structure of the victim's GAN to encourage it to learn the classification features (instead of the visual features) of the real images. We further introduce an unsupervised task into the GAN model to obfuscate the visual features of the generated images. Experiments demonstrate that Anti-GAN effectively prevents the attacker from learning the distribution of the private images while causing little harm to the accuracy of the federated model.
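To make the mixing step concrete, the following is a minimal PyTorch-style sketch of how a victim might blend generator outputs with real images before feeding them to the federated model. The `latent_dim` attribute, the `mix_weight` parameter, and the pixel-wise blend are illustrative assumptions, not the paper's exact mixing scheme.

```python
import torch


def build_antigan_training_batch(generator, real_images, mix_weight=0.5):
    """Blend GAN-generated images with real images for federated training.

    Assumptions (for illustration only): `generator` is a trained torch.nn.Module
    exposing a `latent_dim` attribute and producing images with the same shape
    as `real_images`; `mix_weight` controls a simple pixel-wise blend.
    """
    with torch.no_grad():
        # Sample latent codes and generate one fake image per real image.
        z = torch.randn(real_images.size(0), generator.latent_dim,
                        device=real_images.device)
        fake_images = generator(z)

    # Pixel-wise blend: corrupts the visual features of the real images while
    # the (redesigned) generator is trained to preserve classification features.
    mixed = mix_weight * fake_images + (1.0 - mix_weight) * real_images
    return mixed
```

In this sketch, a larger `mix_weight` hides more of the real images' visual content from a reconstruction attacker, at the cost of potentially degrading the federated model's accuracy; the paper's redesigned GAN objective is what keeps that trade-off small.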