Machine learning models, especially neural networks, are now prevalent in many real-world applications. These models are trained on user data in a one-way process: once users contribute their data, they have no way to withdraw it, and it is well known that a neural network memorizes its training data. This conflicts with the "right to be forgotten" clause of the GDPR and may lead to legal violations. Machine unlearning, which allows users to eliminate the memorization of their private data from a trained machine learning model, has therefore become a popular research topic. In this paper, we propose the first uniform metric, called forgetting rate, to measure the effectiveness of a machine unlearning method. It is based on the concept of membership inference and describes the rate at which the eliminated data transform from "memorized" to "unknown" after unlearning is conducted. We also propose a novel unlearning method called Forsaken. It is superior to previous work in either utility or efficiency when achieving the same forgetting rate. We benchmark Forsaken on eight standard datasets to evaluate its performance. The experimental results show that it achieves a forgetting rate of more than 90\% on average while causing less than 5\% accuracy loss.
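To make the idea of a forgetting rate concrete, the following is a minimal sketch of one plausible reading of the description above: among the eliminated samples that a membership inference attack labels as "memorized" before unlearning, the fraction it labels as "unknown" afterwards. The function name `forgetting_rate`, its boolean inputs, and the handling of the empty case are assumptions made for illustration; the exact formula used in the paper may differ.

```python
from typing import Sequence


def forgetting_rate(member_before: Sequence[bool],
                    member_after: Sequence[bool]) -> float:
    """Illustrative forgetting rate over the eliminated samples.

    member_before[i] / member_after[i] are hypothetical outputs of a
    membership inference attack on sample i, before and after unlearning
    (True = inferred as "memorized", False = inferred as "unknown").
    """
    if len(member_before) != len(member_after):
        raise ValueError("both sequences must describe the same samples")

    # Only samples inferred as memorized before unlearning can be "forgotten".
    memorized = [i for i, was_member in enumerate(member_before) if was_member]
    if not memorized:
        # Assumption: nothing was memorized, so report a rate of zero.
        return 0.0

    # Count the samples that moved from "memorized" to "unknown".
    forgotten = sum(1 for i in memorized if not member_after[i])
    return forgotten / len(memorized)


if __name__ == "__main__":
    before = [True, True, True, False]
    after = [False, False, True, False]
    print(forgetting_rate(before, after))
```

In this example, three of the four eliminated samples are inferred as memorized before unlearning and two of those three are inferred as unknown afterwards, so the sketch reports two thirds.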