Unlearning algorithms aim to remove the influence of deleted data from trained models at a cost lower than that of full retraining. However, prior unlearning guarantees in the literature are flawed and do not protect the privacy of deleted records. We show that when users delete their data as a function of published models, records in a database become interdependent, so even retraining a fresh model after deleting a record does not ensure its privacy. Moreover, unlearning algorithms that cache partial computations to speed up processing can leak information about deleted records across a series of releases, violating their privacy in the long run. To address these issues, we propose a sound deletion guarantee and show that the privacy of existing records is necessary for the privacy of deleted records. Under this notion, we design an accurate, computationally efficient, and secure machine unlearning algorithm based on noisy gradient descent.
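The paper's algorithm and analysis are not reproduced here, but a minimal NumPy sketch may help fix ideas: release only noisy gradient descent iterates (clipped gradients plus Gaussian noise, in the style of differentially private GD), and handle a deletion by fine-tuning from the published model on the remaining records rather than retraining from scratch. All function names, step counts, and noise levels below are illustrative assumptions, not the authors' actual method or parameters.

```python
import numpy as np

def logistic_grad(w, X, y):
    """Average gradient of the logistic loss over (X, y), with labels in {-1, +1}."""
    z = y * (X @ w)
    return -(X * (y / (1.0 + np.exp(z)))[:, None]).mean(axis=0)

def noisy_gd(w, X, y, steps, lr=0.1, sigma=0.05, clip=1.0, rng=None):
    """Noisy gradient descent: clip the gradient norm each step and add Gaussian
    noise, so released iterates carry a differential-privacy-style guarantee."""
    rng = rng or np.random.default_rng(0)
    for _ in range(steps):
        g = logistic_grad(w, X, y)
        g = g / max(1.0, np.linalg.norm(g) / clip)  # bound gradient sensitivity
        w = w - lr * (g + sigma * rng.standard_normal(w.shape))
    return w

# Train on the full dataset and publish w; on deletion, fine-tune on the
# remaining records starting from the published model (hypothetical toy data).
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 5))
y = np.where(X @ rng.standard_normal(5) > 0, 1.0, -1.0)
w = noisy_gd(np.zeros(5), X, y, steps=200)

keep = np.ones(len(X), bool)
keep[:10] = False                                      # first 10 records deleted
w_unlearned = noisy_gd(w, X[keep], y[keep], steps=30)  # unlearning pass
```

The sketch illustrates why noise matters for the claims above: without it, the published iterates (a cached partial computation) are a deterministic function of the deleted records, so subsequent releases can leak them.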