Membership inference attacks (MIA) aim to detect whether data samples were used to train a neural network model, e.g., to detect copyright abuses. We show that models with higher-dimensional input and output are more vulnerable to MIA, and examine in more detail models for image translation and semantic segmentation. We show that reconstruction errors can lead to very effective MIA attacks, as they are indicative of memorization. Unfortunately, reconstruction error alone is less effective at discriminating between hard-to-predict images that were used in training and easy-to-predict images that were never seen before. To overcome this, we propose a novel predictability score that can be computed for each sample and whose computation does not require a training set. Our membership error, obtained by subtracting the predictability score from the reconstruction error, is shown to achieve high MIA accuracy on an extensive number of benchmarks.
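A minimal sketch of the scoring rule described above, assuming per-sample reconstruction errors and predictability scores have already been computed; the function names and the thresholding direction are our own illustration, not taken from the paper:

```python
import numpy as np

def membership_error(reconstruction_error: np.ndarray,
                     predictability_score: np.ndarray) -> np.ndarray:
    # Membership error: per-sample predictability score subtracted
    # from the per-sample reconstruction error.
    return reconstruction_error - predictability_score

def predict_membership(reconstruction_error: np.ndarray,
                       predictability_score: np.ndarray,
                       threshold: float) -> np.ndarray:
    # One plausible decision rule (our assumption): a sample that is
    # reconstructed better than its predictability alone would explain,
    # i.e. whose membership error falls below a threshold, is flagged
    # as a training-set member.
    err = membership_error(reconstruction_error, predictability_score)
    return err < threshold
```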