Machine unlearning is the process of removing the influence of specific training data from a machine learning (ML) model upon receiving a removal request. Retraining the ML model from scratch is straightforward and legitimate, but it incurs a high computational overhead. To address this issue, a number of approximate algorithms have been proposed in the domain of image and text data, among which SISA is the state-of-the-art solution. SISA randomly partitions the training set into multiple shards and trains a constituent model for each shard. However, directly applying SISA to graph data can severely damage the graph structural information, and thereby the utility of the resulting ML model. In this paper, we propose GraphEraser, a novel machine unlearning framework tailored to graph data. Its contributions include two novel graph partition algorithms and a learning-based aggregation method. We conduct extensive experiments on five real-world graph datasets to illustrate the unlearning efficiency and model utility of GraphEraser. It achieves a 2.06$\times$ (small dataset) to 35.94$\times$ (large dataset) improvement in unlearning time. In terms of model utility, GraphEraser achieves up to $62.5\%$ higher F1 score, and our proposed learning-based aggregation method achieves up to $112\%$ higher F1 score.\footnote{Our code is available at \url{https://github.com/MinChen00/Graph-Unlearning}.}
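The shard-based idea attributed to SISA above (random partition, one constituent model per shard) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation and not GraphEraser itself: the function names are hypothetical, the constituent model is a toy 1-D nearest-centroid classifier, and majority voting is assumed as the aggregation rule. It uses neither the graph partition algorithms nor the learning-based aggregation proposed in the paper.

```python
import random
from collections import Counter

def partition(n_samples, n_shards, seed=0):
    """Randomly assign sample indices to shards (random partition, as in SISA)."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[s::n_shards] for s in range(n_shards)]

def train_shard(data, shard):
    """Toy constituent model (assumption): per-class centroid of a 1-D feature."""
    sums, counts = {}, Counter()
    for i in shard:
        x, y = data[i]
        sums[y] = sums.get(y, 0.0) + x
        counts[y] += 1
    return {y: sums[y] / counts[y] for y in sums}

def predict(models, x):
    """Aggregate constituent predictions by majority vote (assumed rule)."""
    votes = Counter(min(m, key=lambda y: abs(x - m[y])) for m in models if m)
    return votes.most_common(1)[0][0]

def unlearn(data, shards, models, removed):
    """Drop one sample and retrain only the shard that contained it."""
    for s, shard in enumerate(shards):
        if removed in shard:
            shard.remove(removed)
            models[s] = train_shard(data, shard)
            return s
    raise KeyError(removed)

# Hypothetical toy data: class 0 clusters near 0..9, class 1 near 20..29.
data = [(float(i), 0) for i in range(10)] + [(float(i), 1) for i in range(20, 30)]
shards = partition(len(data), n_shards=3)
models = [train_shard(data, shard) for shard in shards]

# A removal request for sample 3 retrains one shard instead of the full model.
retrained = unlearn(data, shards, models, removed=3)
```

Only one of the three constituent models is retrained per request, which is where the unlearning-time saving comes from. On graph data, a random partition of this kind would separate connected nodes across shards, which is the structural damage the abstract refers to.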