Users of a recommender system may want part of their data to be deleted, not only from the data repository but also from the underlying machine learning model, for privacy or utility reasons. Such right-to-be-forgotten requests could be fulfilled by simply retraining the recommendation model from scratch, but that would be too slow and too expensive in practice. In this paper, we investigate fast machine unlearning techniques for recommender systems that can remove the effect of a small amount of training data from the recommendation model without incurring the full cost of retraining. A natural idea for speeding this process up is to fine-tune the current recommendation model on the remaining training data instead of starting from a random initialization. This warm-start strategy indeed works for neural recommendation models using standard first-order neural network optimizers (such as AdamW). However, we find that even greater acceleration can be achieved by employing second-order (Newton or quasi-Newton) optimization methods instead. To overcome the prohibitively high computational cost of second-order optimizers, we propose AltEraser, a new recommendation-unlearning approach that divides the optimization problem of unlearning into many small, tractable sub-problems. Extensive experiments on three real-world recommendation datasets show promising results of AltEraser in terms of consistency (forgetting thoroughness), accuracy (recommendation effectiveness), and efficiency (unlearning speed). To our knowledge, this work represents the first attempt at fast approximate machine unlearning for state-of-the-art neural recommendation models.
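To make the alternating, sub-problem-based idea concrete, here is an illustrative sketch (not the paper's AltEraser implementation) of warm-start unlearning on a simple matrix-factorization recommender: the requested interactions are erased, and the factors are then refined by alternating least squares starting from the current model. Each per-user and per-item update is a small closed-form (second-order) sub-problem, mirroring the strategy of splitting unlearning into many tractable pieces instead of retraining from scratch. The function name `als_unlearn` and all hyperparameters are assumptions made for this example.

```python
import numpy as np

def als_unlearn(R, U, V, deleted, lam=0.1, iters=5):
    """Warm-start unlearning sketch for matrix factorization.

    R: (n_users, n_items) rating matrix (0 = unobserved);
    U, V: current user/item factor matrices (warm start);
    deleted: iterable of (user, item) pairs whose effect should be removed.
    """
    R = R.copy()
    mask = R > 0
    for u, i in deleted:            # erase the requested interactions
        R[u, i] = 0.0
        mask[u, i] = False
    k = U.shape[1]
    for _ in range(iters):          # alternating second-order sub-problems
        for u in range(R.shape[0]): # closed-form ridge solve per user
            idx = np.where(mask[u])[0]
            if idx.size == 0:
                continue
            Vi = V[idx]
            A = Vi.T @ Vi + lam * np.eye(k)
            U[u] = np.linalg.solve(A, Vi.T @ R[u, idx])
        for i in range(R.shape[1]): # closed-form ridge solve per item
            idx = np.where(mask[:, i])[0]
            if idx.size == 0:
                continue
            Ui = U[idx]
            A = Ui.T @ Ui + lam * np.eye(k)
            V[i] = np.linalg.solve(A, Ui.T @ R[idx, i])
    return U, V
```

Because each sub-problem is a small ridge regression with an exact solution, the warm-started model converges on the remaining data in a handful of sweeps, which is the intuition behind preferring second-order updates over many first-order gradient steps.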