With the growing adoption of data privacy regulations, the ability to erase private or copyrighted information from trained models has become a crucial requirement. Traditional unlearning methods often assume access to the complete training dataset, which is unrealistic in scenarios where the source data is no longer available. To address this challenge, we propose a certified unlearning framework that enables effective data removal \final{without access to the original training data samples}. Our approach utilizes a surrogate dataset that approximates the statistical properties of the source data, allowing the injected noise to be scaled in a controlled way according to the statistical distance between the two. \updated{While our theoretical guarantees assume knowledge of the exact statistical distance, practical implementations typically approximate this distance, resulting in potentially weaker but still meaningful privacy guarantees.} This calibration yields strong guarantees on the model's behavior after unlearning while preserving its overall utility. We establish theoretical bounds, introduce practical noise calibration techniques, and validate our method through extensive experiments on both synthetic and real-world datasets. The results demonstrate the effectiveness and reliability of our approach in privacy-sensitive settings.
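To make the calibration idea concrete, the following is a minimal sketch, not the paper's exact procedure: it assumes a Gaussian-mechanism-style noise scale that is inflated by an estimate of the surrogate-to-source statistical distance. The functions \texttt{calibrate\_noise} and \texttt{unlearn\_step}, the inflation rule \texttt{(1 + stat\_distance)}, and all numeric values are illustrative assumptions.

\begin{verbatim}
import numpy as np

def calibrate_noise(sensitivity, epsilon, delta, stat_distance):
    """Illustrative Gaussian-mechanism-style noise scale, inflated by an
    estimate of the statistical distance between surrogate and source data.
    The inflation rule is an assumption, not the paper's exact calibration."""
    base_sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    # Larger surrogate-to-source distance -> more noise to retain certification.
    return base_sigma * (1.0 + stat_distance)

def unlearn_step(params, removal_update, sensitivity, epsilon, delta,
                 stat_distance, rng=np.random.default_rng(0)):
    """One hedged unlearning update: subtract an influence estimate of the
    deleted points, then add calibrated Gaussian noise."""
    sigma = calibrate_noise(sensitivity, epsilon, delta, stat_distance)
    noise = rng.normal(0.0, sigma, size=params.shape)
    return params - removal_update + noise

# Toy usage: remove a small influence estimate under (epsilon, delta) = (1, 1e-5)
params = np.zeros(10)
removal_update = np.full(10, 0.01)
new_params = unlearn_step(params, removal_update, sensitivity=0.05,
                          epsilon=1.0, delta=1e-5, stat_distance=0.1)
\end{verbatim}

In this sketch, a larger estimated distance between the surrogate and the (unavailable) source data directly increases the injected noise, which mirrors the trade-off described above: weaker knowledge of the source distribution costs utility but preserves a meaningful guarantee.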