Unlearning algorithms aim to remove the influence of deleted data from trained models at a cost lower than that of full retraining. However, prior unlearning guarantees in the literature are flawed and do not protect the privacy of deleted records. We show that when users delete their data as a function of published models, records in a database become interdependent, so even retraining a fresh model after deleting a record does not ensure its privacy. Moreover, unlearning algorithms that cache partial computations to speed up processing can leak information about deleted records across a series of releases, violating their privacy in the long run. To address these issues, we propose a sound deletion guarantee and show that the privacy of existing records is necessary for the privacy of deleted records. Under this notion, we design an accurate, computationally efficient, and secure machine unlearning algorithm based on noisy gradient descent.
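The paper's algorithm and analysis are not reproduced here, but a minimal NumPy sketch may help fix ideas: release only noisy gradient descent iterates (clipped gradients plus Gaussian noise, in the style of differentially private GD), and handle a deletion by fine-tuning from the published model on the remaining records rather than retraining from scratch. All function names, step counts, and noise levels below are illustrative assumptions, not the authors' actual method or parameters.

```python
import numpy as np

def logistic_grad(w, X, y):
    """Average gradient of the logistic loss over (X, y), with labels in {-1, +1}."""
    z = y * (X @ w)
    return -(X * (y / (1.0 + np.exp(z)))[:, None]).mean(axis=0)

def noisy_gd(w, X, y, steps, lr=0.1, sigma=0.05, clip=1.0, rng=None):
    """Noisy gradient descent: clip the gradient norm each step and add Gaussian
    noise, so released iterates carry a differential-privacy-style guarantee."""
    rng = rng or np.random.default_rng(0)
    for _ in range(steps):
        g = logistic_grad(w, X, y)
        g = g / max(1.0, np.linalg.norm(g) / clip)  # bound gradient sensitivity
        w = w - lr * (g + sigma * rng.standard_normal(w.shape))
    return w

# Train on the full dataset and publish w; on deletion, fine-tune on the
# remaining records starting from the published model (hypothetical toy data).
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 5))
y = np.where(X @ rng.standard_normal(5) > 0, 1.0, -1.0)
w = noisy_gd(np.zeros(5), X, y, steps=200)

keep = np.ones(len(X), bool)
keep[:10] = False                                      # first 10 records deleted
w_unlearned = noisy_gd(w, X[keep], y[keep], steps=30)  # unlearning pass
```

The sketch illustrates why noise matters for the claims above: without it, the published iterates (a cached partial computation) are a deterministic function of the deleted records, so subsequent releases can leak them.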