Differential Privacy (DP) is the de facto standard for reasoning about the privacy guarantees of a training algorithm. Despite the empirical observation that DP reduces the vulnerability of models to existing membership inference (MI) attacks, a theoretical underpinning as to why this is the case is largely missing in the literature. In practice, this means that models need to be trained with DP guarantees that greatly decrease their accuracy. In this paper, we provide a tighter bound on the positive accuracy (i.e., attack precision) of any MI adversary when a training algorithm provides $\epsilon$-DP or $(\epsilon, \delta)$-DP. Our bound informs the design of a novel privacy amplification scheme, where an effective training set is sub-sampled from a larger set prior to the beginning of training, to greatly reduce the bound on MI accuracy. As a result, our scheme enables DP users to employ looser DP guarantees when training their model to limit the success of any MI adversary; this ensures that the model's accuracy is less impacted by the privacy guarantee. Finally, we discuss implications of our MI bound on the field of machine unlearning.
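To make the quantities in the abstract concrete, the sketch below illustrates two standard facts that the paper builds on, not the paper's own tighter bound: (i) the classical baseline implied directly by pure $\epsilon$-DP, namely that under a balanced membership prior any MI adversary's precision is at most $e^{\epsilon}/(1+e^{\epsilon})$, and (ii) the standard amplification-by-sub-sampling lemma, under which running an $\epsilon$-DP training algorithm on a Poisson-sub-sampled dataset (inclusion probability $q$) yields $\ln(1 + q(e^{\epsilon}-1))$-DP with respect to the full pool. All function names and the numeric values are illustrative assumptions.

```python
import math
import random


def baseline_mi_precision_bound(epsilon: float) -> float:
    """Classical baseline implied by pure epsilon-DP: under a balanced
    membership prior, any MI adversary's precision is at most
    e^eps / (1 + e^eps). This is NOT the paper's tighter bound."""
    return math.exp(epsilon) / (1.0 + math.exp(epsilon))


def amplified_epsilon(epsilon: float, q: float) -> float:
    """Standard privacy amplification by Poisson sub-sampling: an
    epsilon-DP algorithm run on a q-sub-sampled dataset satisfies
    ln(1 + q * (e^eps - 1))-DP with respect to the full pool."""
    return math.log(1.0 + q * (math.exp(epsilon) - 1.0))


def subsample_training_set(pool, q: float, seed: int = 0):
    """Draw the effective training set from the larger pool before
    training begins: each example is kept independently with prob. q."""
    rng = random.Random(seed)
    return [x for x in pool if rng.random() < q]


if __name__ == "__main__":
    eps, q = 4.0, 0.1  # hypothetical privacy budget and sub-sampling rate
    eps_eff = amplified_epsilon(eps, q)
    print(f"effective epsilon after sub-sampling: {eps_eff:.3f}")
    print(f"baseline MI precision bound: {baseline_mi_precision_bound(eps_eff):.3f}")
```

As the example suggests, sub-sampling before training lets a looser per-run $\epsilon$ translate into a much smaller effective $\epsilon$ (and hence a smaller bound on MI precision) with respect to the larger pool, which is the intuition behind the amplification scheme described above.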