A key tool for building differentially private systems is adding Gaussian noise to the output of a function evaluated on a sensitive dataset. Unfortunately, using a continuous distribution presents several practical challenges. First and foremost, finite computers cannot exactly represent samples from continuous distributions, and previous work has demonstrated that seemingly innocuous numerical errors can entirely destroy privacy. Moreover, when the underlying data is itself discrete (e.g., population counts), adding continuous noise makes the result less interpretable. With these shortcomings in mind, we introduce and analyze the discrete Gaussian in the context of differential privacy. Specifically, we show theoretically and experimentally that adding discrete Gaussian noise provides essentially the same privacy and accuracy guarantees as adding continuous Gaussian noise. We also present a simple and efficient algorithm for exact sampling from this distribution. This demonstrates its applicability for privately answering counting queries, or more generally, low-sensitivity integer-valued queries.
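To make the idea of exact sampling concrete, the following is a minimal Python sketch of one way to draw from a discrete Gaussian, i.e. a distribution on the integers with P(Y = y) proportional to exp(-y^2 / (2 sigma^2)), using only integer and rational arithmetic. It uses rejection sampling from a discrete Laplace proposal; this is an illustrative assumption and not necessarily the algorithm presented in the paper. All function names (`bernoulli`, `bernoulli_exp`, `discrete_laplace`, `discrete_gaussian`) are hypothetical, and `random.Random` is used only for reproducibility: it is not cryptographically secure, so a real deployment would need a secure source such as `random.SystemRandom`.

```python
import math
import random
from fractions import Fraction


def bernoulli(p, rng):
    """Exact Bernoulli(p) for a rational p in [0, 1]."""
    p = Fraction(p)
    return rng.randrange(p.denominator) < p.numerator


def bernoulli_exp(gamma, rng):
    """Exact Bernoulli(exp(-gamma)) for rational gamma >= 0, without floats."""
    gamma = Fraction(gamma)
    # exp(-gamma) = exp(-1)^k * exp(-(gamma - k)): peel off unit steps first.
    while gamma > 1:
        if not bernoulli_exp(1, rng):
            return False
        gamma -= 1
    # For gamma in [0, 1]: the stopping index k is odd with probability
    # 1 - gamma + gamma^2/2! - gamma^3/3! + ... = exp(-gamma).
    k = 1
    while bernoulli(gamma / k, rng):
        k += 1
    return k % 2 == 1


def discrete_laplace(t, rng):
    """Sample Y with P(Y = y) proportional to exp(-|y| / t), integer t >= 1."""
    while True:
        u = rng.randrange(t)
        if not bernoulli_exp(Fraction(u, t), rng):
            continue  # reject and restart
        v = 0
        while bernoulli_exp(1, rng):
            v += 1
        magnitude = u + t * v  # P(magnitude = x) proportional to exp(-x / t)
        negative = bernoulli(Fraction(1, 2), rng)
        if negative and magnitude == 0:
            continue  # do not count zero twice
        return -magnitude if negative else magnitude


def discrete_gaussian(sigma2, rng):
    """Sample Y with P(Y = y) proportional to exp(-y^2 / (2 * sigma2))."""
    sigma2 = Fraction(sigma2)
    # t = floor(sigma) + 1; the choice of t affects speed, not correctness.
    t = math.isqrt(sigma2.numerator // sigma2.denominator) + 1
    while True:
        y = discrete_laplace(t, rng)
        # Accepting with this probability turns the Laplace proposal
        # into the target Gaussian shape.
        gamma = (abs(y) - sigma2 / t) ** 2 / (2 * sigma2)
        if bernoulli_exp(gamma, rng):
            return y


def noisy_count(true_count, sigma2, rng):
    """Release an integer count with discrete Gaussian noise added."""
    return true_count + discrete_gaussian(sigma2, rng)
```

For instance, `noisy_count(1234, 4, random.Random(0))` returns an integer near 1234. The output stays integer-valued, which matches the interpretability point above for counting queries. The value of `sigma2` needed for a given privacy guarantee depends on the query's sensitivity and is not derived here.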