While it is tempting to believe that data distillation preserves privacy, the empirical robustness of distilled data against known attacks does not imply a provable privacy guarantee. Here, we develop a provably privacy-preserving data distillation algorithm, called differentially private kernel inducing points (DP-KIP). DP-KIP is an instantiation of DP-SGD on kernel ridge regression (KRR). Following recent work, we use neural tangent kernels and minimize the KRR loss to estimate the distilled datapoints (i.e., kernel inducing points). We provide a computationally efficient JAX implementation of DP-KIP, which we test on several popular image and tabular datasets to show its efficacy in data distillation with differential privacy guarantees.
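To make the mechanism concrete, below is a minimal JAX sketch of one DP-KIP update under stated assumptions, not the authors' implementation: an RBF kernel stands in for the neural tangent kernel, the labels of the distilled points are held fixed, and the clipping norm, noise multiplier, and learning rate are illustrative placeholders.

```python
# A minimal sketch of the DP-KIP idea: DP-SGD on a KRR loss, with the
# gradients taken w.r.t. the distilled points (kernel inducing points).
# Assumption: an RBF kernel substitutes for the NTK used in the paper.
import jax
import jax.numpy as jnp

def rbf_kernel(X, Z, gamma=0.1):
    # Pairwise RBF kernel; a stand-in for the neural tangent kernel.
    sq = jnp.sum(X**2, 1)[:, None] + jnp.sum(Z**2, 1)[None, :] - 2 * X @ Z.T
    return jnp.exp(-gamma * sq)

def krr_loss_per_example(X_s, y_s, x_t, y_t, reg=1e-6):
    # Squared error of the KRR predictor (fit on the distilled set) on ONE
    # target example, so DP-SGD can clip per-example gradients.
    K_ss = rbf_kernel(X_s, X_s)
    K_ts = rbf_kernel(x_t[None, :], X_s)
    alpha = jnp.linalg.solve(K_ss + reg * jnp.eye(X_s.shape[0]), y_s)
    return jnp.sum((y_t - K_ts @ alpha) ** 2)

def dp_kip_step(key, X_s, y_s, X_batch, y_batch,
                clip=1.0, noise_mult=1.0, lr=0.1):
    # Per-example gradients w.r.t. the distilled points, via vmap over grad.
    grads = jax.vmap(jax.grad(krr_loss_per_example),
                     in_axes=(None, None, 0, 0))(X_s, y_s, X_batch, y_batch)
    # Clip each per-example gradient to L2 norm `clip`, then sum (DP-SGD).
    norms = jnp.sqrt(jnp.sum(grads**2, axis=(1, 2), keepdims=True))
    g_sum = jnp.sum(grads / jnp.maximum(1.0, norms / clip), axis=0)
    # Add Gaussian noise calibrated to the clipping norm, then average.
    g_noisy = (g_sum + noise_mult * clip *
               jax.random.normal(key, g_sum.shape)) / X_batch.shape[0]
    return X_s - lr * g_noisy

# Toy usage: distill 10 points from a 100-example batch of 5-dim data.
key = jax.random.PRNGKey(0)
k1, k2, k3 = jax.random.split(key, 3)
X_s = jax.random.normal(k1, (10, 5)); y_s = jnp.ones((10, 1))
X_b = jax.random.normal(k2, (100, 5)); y_b = jnp.ones((100, 1))
X_s = dp_kip_step(k3, X_s, y_s, X_b, y_b)
```

Because the KRR loss decomposes over target examples, per-example clipping bounds each training point's influence on the distilled set, which is what lets the Gaussian mechanism yield a differential privacy guarantee.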