Graph neural networks (GNNs) are increasingly used to model complex patterns in graph-structured data. However, enabling them to "forget" designated information remains challenging, especially under privacy regulations such as the GDPR. Existing unlearning methods largely optimize for efficiency and scalability, yet they offer little transparency, and the black-box nature of GNNs makes it difficult to verify whether forgetting has truly occurred. We propose an explainability-driven verifier for GNN unlearning that snapshots the model before and after deletion, using attribution shifts and localized structural changes (for example, graph edit distance) as transparent evidence. The verifier uses five explainability metrics: residual attribution, heatmap shift, explainability score deviation, graph edit distance, and a diagnostic graph rule shift. We evaluate two backbones (GCN, GAT) and four unlearning strategies (Retrain, GraphEditor, GNNDelete, IDEA) across five benchmarks (Cora, Citeseer, Pubmed, Coauthor-CS, Coauthor-Physics). Results show that Retrain and GNNDelete achieve near-complete forgetting, GraphEditor provides partial erasure, and IDEA leaves residual signals. These explanation deltas provide the primary, human-readable evidence of forgetting; we also report membership-inference ROC-AUC as a complementary, graph-wide privacy signal.
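As a hedged illustration of the verification idea described above (not the paper's exact implementation), the sketch below compares attribution maps taken before and after unlearning. The metric definitions (`residual_attribution`, `heatmap_shift`) and the synthetic attribution values are illustrative assumptions standing in for real explainer outputs such as edge masks.

```python
# Minimal sketch: quantify how much attribution mass remains on deleted
# edges after unlearning. Attribution vectors are synthetic stand-ins for
# explainer outputs (e.g., edge importance scores from a GNN explainer);
# the paper's exact metric formulas may differ.
import numpy as np

def residual_attribution(post_attr: np.ndarray, deleted_idx: np.ndarray) -> float:
    """Fraction of post-unlearning attribution mass still on deleted edges."""
    total = post_attr.sum()
    return float(post_attr[deleted_idx].sum() / total) if total > 0 else 0.0

def heatmap_shift(pre_attr: np.ndarray, post_attr: np.ndarray) -> float:
    """1 - cosine similarity between pre- and post-unlearning attribution maps."""
    denom = float(np.linalg.norm(pre_attr) * np.linalg.norm(post_attr))
    return 1.0 - float(pre_attr @ post_attr) / denom if denom > 0 else 0.0

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pre = rng.random(100)        # attribution over 100 edges before deletion
    deleted = np.arange(10)      # indices of edges requested for removal
    post = pre.copy()
    post[deleted] *= 0.05        # an effective unlearner suppresses these edges
    print(f"residual attribution: {residual_attribution(post, deleted):.3f}")
    print(f"heatmap shift:        {heatmap_shift(pre, post):.3f}")
```

In this toy setup, low residual attribution together with a non-trivial heatmap shift would be read as evidence that the deleted edges no longer drive the model's explanations.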