Membership inference attacks (MIAs) against machine learning models can pose serious privacy risks to the dataset used in model training. In this paper, we propose NeuGuard, a novel and effective neuron-guided defense method against MIAs. We identify a key weakness in existing defense mechanisms: they cannot simultaneously defend against the two commonly used neural-network-based MIAs, indicating that these two attacks should be evaluated separately to ensure defense effectiveness. NeuGuard jointly controls the output and inner neurons' activations with the objective of guiding the model outputs on the training set and the testing set toward close distributions. NeuGuard consists of class-wise variance minimization, which restricts the final output neurons, and layer-wise balanced output control, which constrains the inner neurons in each layer. We evaluate NeuGuard and compare it with state-of-the-art defenses against two neural-network-based MIAs and five of the strongest metric-based MIAs, including the newly proposed label-only MIA, on three benchmark datasets. Results show that NeuGuard outperforms state-of-the-art defenses by offering a much improved utility-privacy trade-off, better generality, and lower overhead.
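To make the two regularizers concrete, the following is a minimal PyTorch sketch of how class-wise variance minimization and layer-wise balanced output control might be implemented as auxiliary losses. The function names, the exact penalty forms, and the weighting coefficients `alpha` and `beta` are illustrative assumptions, not the paper's precise formulation.

```python
import torch
import torch.nn.functional as F

def class_wise_variance_loss(logits, labels, num_classes):
    """Sketch of class-wise variance minimization: for each class,
    penalize the variance of softmax outputs across samples of that
    class, pushing final output distributions together per class."""
    probs = F.softmax(logits, dim=1)
    loss = logits.new_zeros(())
    for c in range(num_classes):
        mask = labels == c
        if mask.sum() > 1:
            loss = loss + probs[mask].var(dim=0, unbiased=False).sum()
    return loss

def layer_wise_balance_loss(activations):
    """Sketch of layer-wise balanced output control: for each hidden
    layer, penalize deviation of neuron activations from the layer's
    mean, constraining inner neurons toward a balanced output."""
    loss = 0.0
    for act in activations:  # list of [batch, features] tensors
        flat = act.flatten(1)
        loss = loss + ((flat - flat.mean(dim=1, keepdim=True)) ** 2).mean()
    return loss

# Illustrative training objective: task loss plus the two regularizers,
# with hypothetical weights alpha and beta chosen on a validation set.
# total = F.cross_entropy(logits, labels) \
#       + alpha * class_wise_variance_loss(logits, labels, num_classes) \
#       + beta * layer_wise_balance_loss(hidden_activations)
```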