Federated learning has been proposed as a privacy-preserving machine learning framework that enables multiple clients to collaborate without sharing raw data. However, the framework does not guarantee client privacy by design. Prior work has shown that gradient sharing strategies in federated learning can be vulnerable to data reconstruction attacks. In practice, though, clients may not transmit raw gradients, either because of the high communication cost or because of privacy enhancement requirements. Empirical studies have demonstrated that gradient obfuscation, including intentional obfuscation via gradient noise injection and unintentional obfuscation via gradient compression, can provide more privacy protection against reconstruction attacks. In this work, we present a new data reconstruction attack framework targeting the image classification task in federated learning. We show that commonly adopted gradient postprocessing procedures, such as gradient quantization, gradient sparsification, and gradient perturbation, may give a false sense of security in federated learning. Contrary to prior studies, we argue that privacy enhancement should not be treated as a byproduct of gradient compression. Additionally, we design a new method under the proposed framework to reconstruct the image at the semantic level. We quantify the semantic privacy leakage and compare it with conventional leakage measures based on image similarity scores. Our comparisons challenge the image data leakage evaluation schemes in the literature. The results emphasize the importance of revisiting and redesigning the privacy protection mechanisms for client data in existing federated learning algorithms.
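For readers unfamiliar with the three gradient postprocessing procedures named above, the following is a minimal illustrative sketch of what a client might apply to a gradient before transmission. It is not the method or configuration evaluated in this work: the function names (`sparsify_top_k`, `quantize_uniform`, `perturb_gaussian`), the default parameters, and the choice of top-k sparsification, uniform quantization, and Gaussian noise are all assumptions made for illustration.

```python
import numpy as np


def sparsify_top_k(grad, keep_ratio=0.1):
    """Top-k sparsification: keep the largest-magnitude entries, zero the rest."""
    flat = grad.ravel()
    k = max(1, int(round(keep_ratio * flat.size)))
    # Indices of the k entries with the largest absolute value.
    keep = np.argpartition(np.abs(flat), flat.size - k)[flat.size - k:]
    out = np.zeros_like(flat)
    out[keep] = flat[keep]
    return out.reshape(grad.shape)


def quantize_uniform(grad, num_bits=8):
    """Uniform quantization: snap entries to 2**num_bits evenly spaced levels."""
    lo, hi = float(grad.min()), float(grad.max())
    if hi == lo:
        # A constant gradient needs no quantization.
        return grad.copy()
    step = (hi - lo) / (2 ** num_bits - 1)
    return lo + np.round((grad - lo) / step) * step


def perturb_gaussian(grad, sigma=0.01, rng=None):
    """Perturbation: add zero-mean Gaussian noise with standard deviation sigma."""
    rng = np.random.default_rng() if rng is None else rng
    return grad + rng.normal(0.0, sigma, size=grad.shape)


# Hypothetical gradient of a small layer, used only to exercise the functions.
grad = np.array([[0.5, -2.0, 0.1], [3.0, -0.2, 0.05]])
sparse = sparsify_top_k(grad, keep_ratio=1 / 3)
quantized = quantize_uniform(grad, num_bits=2)
noisy = perturb_gaussian(grad, sigma=0.01, rng=np.random.default_rng(0))
```

Each function returns an array of the same shape as its input, so the three steps can be chained in any order. The abstract's claim is that transformations of this kind, whatever their exact parameters, should not be assumed to prevent reconstruction of the client's data.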