Unlearning has emerged as a technique to efficiently erase the information of deleted records from trained models. We show, however, that the influence created by the original presence of a data point in the training set can still be detected after running certified unlearning algorithms, potentially enabling its reconstruction by an adversary. Thus, under realistic assumptions about the dynamics of model releases over time and in the presence of adaptive adversaries, we show that unlearning is not equivalent to data deletion and does not guarantee the "right to be forgotten." We then propose a more robust data-deletion guarantee and show that satisfying differential privacy is necessary to ensure true data deletion. Under our notion, we propose an accurate, computationally efficient, and secure data-deletion machine learning algorithm in the online setting based on noisy gradient descent.
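To make the algorithmic idea concrete, below is a minimal sketch of how a noisy-gradient-descent-based deletion step might look. This is an illustration of the general technique, not the paper's certified algorithm: the function names (`grad_loss`, `noisy_gd_update`, `process_deletion`), the step size `eta`, the noise scale `sigma`, and the number of steps are all assumed placeholders, and in practice `sigma` would be calibrated to the desired deletion/privacy guarantee.

```python
import numpy as np

def noisy_gd_update(theta, data, grad_loss, eta=0.01, sigma=0.1, rng=None):
    """One noisy gradient descent step on the current (retained) dataset.

    Gaussian noise is injected into each update; the noise scale is what
    would provide the privacy/deletion guarantee in a calibrated version.
    """
    rng = np.random.default_rng() if rng is None else rng
    g = np.mean([grad_loss(theta, x) for x in data], axis=0)  # empirical gradient
    noise = rng.normal(0.0, sigma, size=theta.shape)          # per-step Gaussian noise
    return theta - eta * (g + noise)

def process_deletion(theta, dataset, deleted_idx, grad_loss, num_steps=50):
    """Handle a deletion request in the online setting (sketch only):
    drop the deleted records and continue noisy GD on the remainder."""
    retained = [x for i, x in enumerate(dataset) if i not in deleted_idx]
    for _ in range(num_steps):
        theta = noisy_gd_update(theta, retained, grad_loss)
    return theta
```

The design intent captured here is that deletion is handled by further noisy training on the retained data rather than by a one-shot "unlearning" correction, so the released model's randomness, rather than an approximate influence removal, underwrites the deletion guarantee.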