Sketching is an important tool for dealing with high-dimensional vectors that are sparse (or well-approximated by a sparse vector), and it is especially useful in distributed, parallel, and streaming settings. It is known that sketches can be made differentially private by adding noise calibrated to the sensitivity of the sketch, and this approach has been used in private analytics and federated learning settings. The post-processing property of differential privacy implies that all estimates computed from the sketch can be released within the given privacy budget. In this paper we consider the classical CountSketch, made differentially private with the Gaussian mechanism, and give an improved analysis of its estimation error. Perhaps surprisingly, the privacy-utility trade-off is essentially the best one could hope for, independent of the number of repetitions in CountSketch: the error is almost identical to the error from non-private CountSketch plus the noise needed to make the vector private in the original, high-dimensional domain.
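To make the setup concrete, the following is a minimal sketch of CountSketch made private with the Gaussian mechanism. It is an illustration under stated assumptions, not the construction analyzed in the paper. The function names, the explicitly stored random hash tables, the neighboring relation (inputs that differ in a single coordinate by at most 1, which gives the sketch an L2 sensitivity of sqrt(k) for k repetitions), and the classical noise calibration sigma = sensitivity * sqrt(2 ln(1.25/delta)) / eps are all assumptions made for this example.

```python
import numpy as np


def make_hashes(d, k, b, rng):
    """Draw k bucket hashes [d] -> [b] and k sign hashes [d] -> {-1, +1}.

    Assumption: fully random tables stored explicitly, for illustration only.
    """
    buckets = rng.integers(0, b, size=(k, d))
    signs = rng.choice(np.array([-1.0, 1.0]), size=(k, d))
    return buckets, signs


def count_sketch(x, buckets, signs, b):
    """Non-private CountSketch: row j adds signs[j, i] * x[i] to bucket buckets[j, i]."""
    k = buckets.shape[0]
    table = np.zeros((k, b))
    for j in range(k):
        table[j] = np.bincount(buckets[j], weights=signs[j] * x, minlength=b)
    return table


def gaussian_sigma(k, eps, delta):
    """Noise scale for the classical Gaussian mechanism.

    Assumption: neighboring inputs differ in one coordinate by at most 1,
    so each of the k rows changes by at most 1 in one cell and the L2
    sensitivity of the whole sketch is sqrt(k).
    """
    sensitivity = np.sqrt(k)
    return sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / eps


def privatize(table, sigma, rng):
    """Add independent N(0, sigma^2) noise to every cell of the sketch."""
    return table + rng.normal(0.0, sigma, size=table.shape)


def estimate(table, buckets, signs, i):
    """Estimate x[i] as the median over rows of the signed bucket contents."""
    k = table.shape[0]
    values = signs[:, i] * table[np.arange(k), buckets[:, i]]
    return float(np.median(values))
```

A typical use would draw the hashes once, compute `count_sketch(x, ...)`, release `privatize(table, gaussian_sigma(k, eps, delta), rng)`, and then answer any number of `estimate` queries from the released table. By the post-processing property, those queries consume no additional privacy budget.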