Machine learning models, mainly neural networks, are increasingly deployed in real-world applications, where users supply their data for training. This process is usually one-way: once trained, a model retains information about its training data, and the influence of a sample persists in the model even after the sample is removed from the dataset. As more laws and regulations around the world protect data privacy, it becomes increasingly important to make models forget such data completely through machine unlearning. This paper adopts a projection residual method based on Newton's method to perform machine unlearning for linear regression models and neural network models. The method uses iterative weighting to remove both the data and its influence on the model. Its computational cost is linear in the feature dimension of the data and independent of the size of the training set, which improves on existing approaches. Results are evaluated with the feature injection test (FIT). Experiments show that the method deletes data more thoroughly, with results close to those of retraining the model.
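
The idea that a Newton-type update can remove a sample's influence and land close to a retrained model can be illustrated for linear regression. The sketch below is not the paper's projection residual method: it is a standard single Newton step combined with a Sherman-Morrison downdate of the inverse Hessian, and it costs O(d^2) per removed sample rather than being linear in the feature dimension d. The function names `fit_ridge` and `unlearn_point`, the small ridge term `lam`, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def fit_ridge(X, y, lam=1e-3):
    """Closed-form ridge regression; returns the weights and the inverse Hessian."""
    d = X.shape[1]
    H = X.T @ X + lam * np.eye(d)  # Hessian of 0.5*||Xw - y||^2 + 0.5*lam*||w||^2
    H_inv = np.linalg.inv(H)
    return H_inv @ X.T @ y, H_inv


def unlearn_point(w, H_inv, x, y_val):
    """Remove one sample (x, y_val) with a single Newton step on the remaining loss."""
    # Sherman-Morrison downdate: inverse of (H - x x^T) computed from H_inv.
    Hx = H_inv @ x
    H_inv_new = H_inv + np.outer(Hx, Hx) / (1.0 - x @ Hx)
    # At the old optimum the full gradient is zero, so the gradient of the
    # remaining loss is minus the removed sample's gradient.
    grad = -(x @ w - y_val) * x
    # The loss is quadratic, so one Newton step reaches the new optimum.
    w_new = w - H_inv_new @ grad
    return w_new, H_inv_new


# Hypothetical synthetic data, used only to compare unlearning with retraining.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
y = X @ rng.normal(size=5) + 0.1 * rng.normal(size=50)

w_full, H_inv_full = fit_ridge(X, y)
w_unlearned, _ = unlearn_point(w_full, H_inv_full, X[0], y[0])
w_retrained, _ = fit_ridge(X[1:], y[1:])

print("max |unlearned - retrained| =", np.max(np.abs(w_unlearned - w_retrained)))
```

Because the ridge objective is quadratic, the update matches retraining up to floating-point error in this special case. For neural networks the loss is not quadratic, so any Newton-style update can only approximate retraining.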