Neural networks pose a privacy risk due to their propensity to memorise and leak training data. We show that unique features occurring only once in the training data are memorised by discriminative multi-layer perceptrons and convolutional neural networks trained on benchmark imaging datasets. We design our method for settings where the sensitive training data is not available, for example medical imaging. In our setting, the unique feature is known, but not the training data, the model weights, or the unique feature's label. We develop a score that estimates a model's sensitivity to a unique feature by comparing the KL divergences of the model's output distributions given modified out-of-distribution images. We find that typical strategies to prevent overfitting do not prevent unique feature memorisation, and that images containing a unique feature are highly influential regardless of the influence of the image's other features. We also find significant variation in memorisation with the training seed. These results imply that neural networks pose a privacy risk to rarely occurring private information. This risk is more pronounced in healthcare applications, since sensitive patient information can be memorised when it remains in the training data as a result of an imperfect data sanitisation process.
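To make the scoring idea concrete, the sketch below is one possible instantiation in PyTorch (not the authors' released code): it measures how much inserting a candidate unique feature into out-of-distribution images shifts the model's output distribution, relative to inserting a non-memorised baseline feature. The names `model`, `insert_feature`, and the feature patches are illustrative assumptions, not part of the paper.

```python
# Minimal sketch, assuming a trained PyTorch classifier `model`, a batch of
# out-of-distribution images `ood_images`, and a helper `insert_feature`
# that pastes a feature patch into each image. All are placeholders.
import torch
import torch.nn.functional as F

def output_distribution(model, images):
    """Softmax output distribution of the model for a batch of images."""
    with torch.no_grad():
        return F.softmax(model(images), dim=-1).clamp_min(1e-12)

def sensitivity_score(model, ood_images, unique_feature,
                      baseline_feature, insert_feature):
    """Compare the output shift caused by the unique feature with the shift
    caused by a baseline feature on out-of-distribution images. A larger
    score suggests greater sensitivity to (and possible memorisation of)
    the unique feature."""
    p_clean = output_distribution(model, ood_images)
    p_unique = output_distribution(model, insert_feature(ood_images, unique_feature))
    p_baseline = output_distribution(model, insert_feature(ood_images, baseline_feature))

    # Per-image KL divergence between clean and modified output distributions.
    kl_unique = (p_clean * (p_clean.log() - p_unique.log())).sum(dim=-1)
    kl_baseline = (p_clean * (p_clean.log() - p_baseline.log())).sum(dim=-1)

    # Average extra shift attributable to the unique feature.
    return (kl_unique - kl_baseline).mean().item()
```

This sketch compares KL divergences of output distributions given modified out-of-distribution images, as described above; the exact pairing of distributions and the choice of baseline features are assumptions for illustration.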