Linear sketches are widely used to process fast data streams: they support accurate frequency estimation, approximate top-K item queries, and summaries of data distributions. When the data are sensitive, linear sketches should come with privacy guarantees that protect private information while still delivering useful results with theoretical bounds. We show that adding a small amount of noise at initialization is enough for linear sketches to ensure privacy while retaining their distinctive properties. Building on these differentially private linear sketches, we show that the state-of-the-art quantile sketch in the turnstile model can also be made private while maintaining high performance. Experiments on synthetic and real datasets further demonstrate that our proposed differentially private sketches are quantitatively and qualitatively similar to noise-free sketches, with high utilization.
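To make the idea of noise added at initialization concrete, here is a minimal sketch using a Count-Min sketch as one example of a linear sketch. It is not the authors' construction. The class name `PrivateCountMin`, the hashing scheme, the Laplace noise distribution, and the calibration (scale = depth / epsilon, assuming neighboring streams differ in one unit update that touches one counter per row) are all assumptions made for illustration. Taking the minimum over rows is carried over from the standard Count-Min estimator and may not be the best choice once noise is present.

```python
import hashlib
import random


class PrivateCountMin:
    """Hypothetical Count-Min sketch whose counters start at Laplace noise."""

    def __init__(self, width, depth, epsilon, seed=None):
        self.width = width
        self.depth = depth
        rng = random.Random(seed)
        # Assumption: one unit update changes one counter in each of the
        # `depth` rows, so the L1 sensitivity is `depth`.
        scale = depth / epsilon
        # Noise is drawn once, here, and never again: later updates are
        # plain additions, so the sketch stays linear.
        self.table = [
            [self._laplace(rng, scale) for _ in range(width)]
            for _ in range(depth)
        ]

    @staticmethod
    def _laplace(rng, scale):
        # The difference of two i.i.d. exponentials is Laplace(0, scale).
        rate = 1.0 / scale
        return rng.expovariate(rate) - rng.expovariate(rate)

    def _index(self, row, item):
        # One salted hash per row (illustrative, not a tuned hash family).
        digest = hashlib.blake2b(
            repr(item).encode("utf-8"),
            digest_size=8,
            salt=row.to_bytes(8, "little"),
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def update(self, item, delta=1):
        # Turnstile-style update: delta may be negative (a deletion).
        for row in range(self.depth):
            self.table[row][self._index(row, item)] += delta

    def estimate(self, item):
        # Standard Count-Min estimator applied to the noisy counters.
        return min(
            self.table[row][self._index(row, item)]
            for row in range(self.depth)
        )


sketch = PrivateCountMin(width=256, depth=4, epsilon=1.0, seed=7)
for _ in range(1000):
    sketch.update("alice")
sketch.update("alice", -10)  # deletions are allowed in the turnstile model
print(round(sketch.estimate("alice"), 2))
```

Because the noise is added only once, the resulting table is the noise-free table plus a fixed noise table, which is why properties that rely on linearity carry over in this toy version.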