Privacy attacks on machine learning models aim to identify the data that was used to train such models. Traditionally, such attacks are studied on static models that are trained once and remain accessible to the adversary. Motivated by new legal requirements, many machine learning methods have recently been extended to support machine unlearning, i.e., updating models as if certain examples were removed from their training sets. However, privacy attacks could become even more devastating in this new setting, since an attacker can now access both the original model before deletion and the new model after deletion. In fact, the very act of deletion might make the deleted record more vulnerable to privacy attacks. Inspired by cryptographic definitions and the differential privacy framework, we formally study the privacy implications of machine unlearning. We formalize (various forms of) deletion inference and deletion reconstruction attacks, in which the adversary aims either to identify which record was deleted or to reconstruct (perhaps part of) the deleted records. We then present successful deletion inference and reconstruction attacks for a variety of machine learning models and tasks such as classification, regression, and language models. Finally, we show that our attacks would provably be precluded if the schemes satisfy (variants of) Deletion Compliance (Garg, Goldwasser, and Vasudevan, Eurocrypt '20).
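To make the threat model concrete, the sketch below illustrates one simple deletion inference heuristic, not the paper's exact construction: an adversary who can query a model before and after one record is unlearned scores each candidate record by how much its loss increases and guesses the highest-scoring one. The synthetic data, the use of scikit-learn logistic regression, the modeling of unlearning as exact retraining without the deleted record, and the loss-difference score are all illustrative assumptions.

```python
# Hypothetical deletion inference sketch: the adversary compares per-example
# losses under the pre-deletion and post-deletion models and infers that the
# record whose loss grew the most is the one that was deleted.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic training set (stand-in for the victim's data).
X, y = make_classification(n_samples=200, n_features=20, random_state=0)

# "Before" model, trained on the full dataset.
model_before = LogisticRegression(max_iter=1000).fit(X, y)

# Simulate exact unlearning of one record by retraining without it.
deleted_idx = 17
mask = np.ones(len(X), dtype=bool)
mask[deleted_idx] = False
model_after = LogisticRegression(max_iter=1000).fit(X[mask], y[mask])

def per_example_loss(model, X, y):
    """Cross-entropy loss of each example under the given model."""
    probs = model.predict_proba(X)
    return -np.log(np.clip(probs[np.arange(len(y)), y], 1e-12, 1.0))

# Score each candidate by the increase in its loss after deletion.
scores = per_example_loss(model_after, X, y) - per_example_loss(model_before, X, y)
guess = int(np.argmax(scores))
print(f"true deleted index: {deleted_idx}, adversary's guess: {guess}")
```

In this toy setting the adversary needs only black-box access to both model versions; the paper's attacks and the deletion reconstruction variants are more general, but the same before/after comparison is the core source of leakage.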