Machine unlearning, i.e., having a model forget some of its training data, has become increasingly important as privacy legislation promotes variants of the right to be forgotten. In the context of deep learning, approaches to machine unlearning fall broadly into two classes: exact unlearning methods, where an entity formally removes a data point's impact on the model by retraining the model from scratch, and approximate unlearning, where an entity approximates the model parameters one would obtain by exact unlearning in order to save on compute costs. In this paper, we first show that the definition underlying approximate unlearning, which seeks to prove that the approximately unlearned model is close to an exactly retrained model, is incorrect because one can obtain the same model using different datasets. Thus one could unlearn without modifying the model at all. We then turn to exact unlearning approaches and ask how to verify their claims of unlearning. Our results show that even for a given training trajectory one cannot formally prove the absence of certain data points used during training. We thus conclude that unlearning is only well-defined at the algorithmic level, where an entity's only possible auditable claim to unlearning is that they used a particular algorithm designed to allow for external scrutiny during an audit.
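To make the first claim concrete, below is a minimal sketch (not taken from the paper; the names gd_step, D1, and D2 are illustrative) of how two different training sets can yield exactly the same parameters after a gradient step, so the final model alone cannot certify which data points were or were not used.

```python
import numpy as np

def gd_step(w, data, lr=0.1):
    """One full-batch gradient-descent step for a 1-D linear model
    f(x) = w * x with squared loss 0.5 * (w * x - y)**2."""
    grads = [(w * x - y) * x for x, y in data]
    return w - lr * np.mean(grads)

w0 = 0.0
D1 = [(1.0, 1.0), (1.0, 3.0)]   # dataset that contains the point (1, 1)
D2 = [(1.0, 2.0)]               # dataset that never contained (1, 1)

# Both datasets produce the same average gradient (-2) at w0 = 0,
# hence identical updated parameters.
w1_from_D1 = gd_step(w0, D1)
w1_from_D2 = gd_step(w0, D2)
print(w1_from_D1, w1_from_D2)   # 0.2 0.2
assert np.isclose(w1_from_D1, w1_from_D2)
```

Under this (toy) assumption, a model "trained without" the point (1, 1) is indistinguishable from one trained with it, which is why closeness in parameter space cannot serve as a definition of unlearning.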