Differential Privacy (DP) is the de facto standard for reasoning about the privacy guarantees of a training algorithm. Despite the empirical observation that DP reduces the vulnerability of models to existing membership inference (MI) attacks, a theoretical explanation for why this is the case is largely missing in the literature. In practice, this means that models need to be trained with DP guarantees that greatly decrease their accuracy. In this paper, we provide a tighter bound on the accuracy of any MI adversary when the training algorithm provides $\epsilon$-DP. Our bound informs the design of a novel privacy amplification scheme: an effective training set is sub-sampled from a larger set prior to the beginning of training, which greatly reduces the bound on MI accuracy. As a result, our scheme enables practitioners to employ looser $\epsilon$-DP guarantees when training their models while still limiting the success of any MI adversary; this ensures that the model's accuracy is less impacted by the privacy guarantee. Finally, we discuss the implications of our MI bound for the field of machine unlearning.
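For concreteness, the following is a minimal sketch of the sub-sampling step the abstract describes: an effective training set is drawn from a larger pool before training starts, and only the sub-sample is ever passed to the DP training routine. The function name `train_with_dp`, the Poisson sub-sampling choice, and all parameter names are illustrative assumptions, not the paper's implementation.

```python
# Illustrative sketch: sub-sample an effective training set from a larger
# pool before training, then hand it to a user-supplied DP training routine.
import numpy as np

def subsample_then_train(pool_X, pool_y, sampling_rate, train_with_dp, seed=0):
    """Draw an effective training set from the pool and train on it with DP.

    Each example in the pool is kept independently with probability
    `sampling_rate` (Poisson sub-sampling), so any individual example only
    appears in the effective training set with that probability.
    """
    rng = np.random.default_rng(seed)
    keep = rng.random(len(pool_X)) < sampling_rate
    effective_X, effective_y = pool_X[keep], pool_y[keep]
    # The DP guarantee itself comes from the training routine; the
    # sub-sampling step is what amplifies the protection against
    # membership inference for examples in the original pool.
    return train_with_dp(effective_X, effective_y)
```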