A large body of research has shown that machine learning models are vulnerable to membership inference (MI) attacks that violate the privacy of the participants in the training data. Most MI research focuses on the case of a single standalone model, while production machine-learning platforms often update models over time, on data whose distribution shifts, giving the attacker more information. This paper proposes new attacks that take advantage of one or more model updates to improve MI. A key part of our approach is to leverage rich information from standalone MI attacks mounted separately against the original and updated models, and to combine this information in specific ways to improve attack effectiveness. We propose a set of combination functions and tuning methods for each, and present both analytical and quantitative justification for various options. Our results on four public datasets show that our attacks are effective at using update information to give the adversary a significant advantage not only over attacks on standalone models, but also over a prior MI attack that takes advantage of model updates in a related machine-unlearning setting. We perform the first measurements of the impact of distribution shift on MI attacks with model updates, and show that a more drastic distribution shift results in significantly higher MI risk than a gradual shift. Our code is available at https://www.github.com/stanleykywu/model-updates.
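To make the core idea concrete, the following is a minimal sketch of combining standalone MI signals from the original and updated models. It assumes the attacker can query both models for per-example losses; the model objects, the `predict_proba` interface, and the combination functions shown (difference, sum) are illustrative assumptions, not the paper's exact attack set.

```python
import numpy as np

def loss_signal(model, x, y):
    """Standalone MI signal: per-example cross-entropy loss
    (lower loss suggests membership). `model.predict_proba`
    is a hypothetical scikit-learn-style interface."""
    p = model.predict_proba(x)
    return -np.log(p[np.arange(len(y)), y] + 1e-12)

def combined_mi_score(model_orig, model_updated, x, y, combine="diff"):
    """Combine the standalone signals from the original and updated
    models. The two combination functions below are illustrative
    stand-ins for the paper's tuned combination functions."""
    s0 = loss_signal(model_orig, x, y)
    s1 = loss_signal(model_updated, x, y)
    if combine == "diff":
        # Points added in the update should see a larger loss drop
        # between the original and updated models.
        return s0 - s1
    if combine == "sum":
        # Negated total loss: low loss under both models is
        # weak evidence of membership in either training set.
        return -(s0 + s1)
    raise ValueError(f"unknown combination function: {combine}")

# Membership is then predicted by thresholding the combined score,
# e.g. combined_mi_score(f0, f1, x, y) > tau, with tau tuned offline.
```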