Membership inference (MI) attacks aim to determine whether a specific data sample was used to train a machine learning model. MI is therefore a major privacy threat to models trained on private sensitive data, such as medical records. MI attacks may be mounted in the black-box setting, where the model's parameters and activations are hidden from the adversary, or in the white-box setting, where they are available to the attacker. In this work, we focus on the latter and present a novel MI attack that employs influence functions, or more specifically the samples' self-influence scores, to perform the MI prediction. We evaluate our attack on the CIFAR-10, CIFAR-100, and Tiny ImageNet datasets, using a variety of architectures such as AlexNet, ResNet, and DenseNet. Our attack method achieves new state-of-the-art results for training both with and without data augmentation. Code is available at https://github.com/giladcohen/sif_mi_attack.
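To make the idea concrete, below is a minimal sketch of a self-influence-based membership test in PyTorch. It is not the paper's implementation: it approximates the inverse Hessian by the identity (so the score reduces to the squared per-sample gradient norm), and the decision threshold and the direction of the comparison are illustrative assumptions that an attacker would calibrate on samples with known membership.

```python
import torch
import torch.nn.functional as F

def self_influence(model, x, y):
    """Crude self-influence score of a single sample (x, y).

    True self-influence is grad(L)^T H^{-1} grad(L); here the Hessian is
    approximated by the identity, so the score is the squared gradient norm.
    """
    model.eval()
    loss = F.cross_entropy(model(x.unsqueeze(0)), y.view(1))
    params = [p for p in model.parameters() if p.requires_grad]
    grads = torch.autograd.grad(loss, params)
    return sum((g * g).sum() for g in grads).item()

def predict_membership(model, x, y, threshold):
    """Illustrative decision rule: both the threshold and the comparison
    direction are assumptions, to be calibrated on reference data."""
    return self_influence(model, x, y) < threshold
```

In practice, the inverse-Hessian-vector product could instead be estimated with an iterative scheme (e.g., LiSSA), and the threshold chosen to maximize attack accuracy on held-out members and non-members.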