Data deletion algorithms aim to remove the influence of deleted data points from trained models at a cheaper computational cost than fully retraining those models. However, for sequences of deletions, most prior work in the non-convex setting gives valid guarantees only for sequences that are chosen independently of the models that are published. If people choose to delete their data as a function of the published models (because they don't like what the models reveal about them, for example), then the update sequence is adaptive. In this paper, we give a general reduction from deletion guarantees against adaptive sequences to deletion guarantees against non-adaptive sequences, using differential privacy and its connection to max information. Combined with ideas from prior work that give guarantees for non-adaptive deletion sequences, this leads to extremely flexible algorithms able to handle arbitrary model classes and training methodologies, giving strong provable deletion guarantees for adaptive deletion sequences. We show in theory how prior work for non-convex models fails against adaptive deletion sequences, and use this intuition to design a practical attack against the SISA algorithm of Bourtoule et al. [2021] on CIFAR-10, MNIST, and Fashion-MNIST.
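To make the SISA setting concrete, here is a toy sketch of its core idea, which is not the authors' implementation: training data is partitioned into shards, one model is trained per shard, predictions are aggregated by majority vote, and deleting a point requires retraining only the shard that contained it. The nearest-centroid "model", the class names, and the shard layout below are all illustrative assumptions.

```python
def train_shard(points):
    # Toy per-shard "model": one centroid per class label.
    # (Real SISA trains a neural network per shard; this is only a sketch.)
    sums, counts = {}, {}
    for x, y in points:
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in sums[y]] for y in sums}

def predict(models, x):
    # Each shard model votes for its nearest class centroid;
    # the ensemble returns the majority label.
    votes = {}
    for m in models:
        best = min(m, key=lambda y: sum((a - b) ** 2 for a, b in zip(x, m[y])))
        votes[best] = votes.get(best, 0) + 1
    return max(votes, key=votes.get)

class ToySISA:
    """Hypothetical sharded-unlearning ensemble in the spirit of SISA."""

    def __init__(self, data, n_shards):
        # Round-robin partition into disjoint shards.
        self.shards = [data[i::n_shards] for i in range(n_shards)]
        self.models = [train_shard(s) for s in self.shards]

    def delete(self, point):
        # Deletion cost: retraining a single shard, not the full ensemble.
        for i, shard in enumerate(self.shards):
            if point in shard:
                shard.remove(point)
                self.models[i] = train_shard(shard)
                return

# Illustrative data: class 'a' near the origin, class 'b' near (10, 10),
# interleaved so each shard sees both classes.
data = [((0, 0), 'a'), ((1, 0), 'a'), ((10, 10), 'b'), ((10, 11), 'b'),
        ((0, 1), 'a'), ((1, 1), 'a'), ((11, 10), 'b'), ((11, 11), 'b')]
ensemble = ToySISA(data, n_shards=2)
ensemble.delete(((0, 0), 'a'))  # retrains only the shard holding this point
```

The adaptivity concern in the abstract arises exactly here: an adversary who sees which aggregated predictions change after a deletion can infer shard membership, which is what breaks the non-adaptive guarantee.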