Distributed learning paradigms such as federated learning often involve the transmission of model updates, or gradients, over a network, thereby avoiding transmission of private data. However, gradients can still reveal sensitive information about the training data. Prior works have demonstrated that labels can be revealed analytically from the last layer of certain models (e.g., ResNet), or that they can be reconstructed jointly with model inputs via Gradients Matching [Zhu et al'19] given additional knowledge about the current state of the model. In this work, we propose a method to discover the set of labels of training samples from only the gradient of the last layer and the id-to-label mapping. Our method is applicable to a wide variety of model architectures across multiple domains. We demonstrate its effectiveness for model training in two domains: image classification and automatic speech recognition. Furthermore, we show that existing reconstruction techniques become more effective when used in conjunction with our method. Conversely, we demonstrate that gradient quantization and sparsification can significantly reduce the success of the attack.
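To illustrate the general principle behind such label leakage (not the paper's exact method), consider a last linear layer followed by softmax cross-entropy. The gradient of the loss with respect to the bias is `probs - one_hot`, so classes present in the batch receive a negative contribution and, when per-class probabilities are small relative to `1/batch`, their gradient entries turn negative while absent classes stay positive. A minimal NumPy sketch, with all variable names (`grad_b`, `recovered`) chosen for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
num_classes, batch = 10, 4
labels = rng.choice(num_classes, size=batch, replace=False)

# Last layer: logits = W h + b, followed by softmax cross-entropy.
# Small logits keep the softmax close to uniform for this demo.
logits = 0.1 * rng.normal(size=(batch, num_classes))
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
one_hot = np.eye(num_classes)[labels]

# Gradient of the mean loss w.r.t. the bias b:
#   dL/db = mean(probs - one_hot, axis=0)
# Each present label contributes -1/batch, so its entry tends to be
# negative; absent classes only accumulate positive probabilities.
grad_b = (probs - one_hot).mean(axis=0)

# Negative bias-gradient entries reveal which labels were in the batch.
recovered = set(np.where(grad_b < 0)[0])
print(sorted(recovered))
```

This sign-based recovery is only reliable when the per-class softmax mass stays below the per-label `1/batch` contribution; large batches or confident (well-trained) models weaken it, which is part of what motivates more general methods like the one proposed here.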