We revisit the classical problem of nonparametric density estimation, but impose local differential privacy constraints. Under such constraints, the original multivariate data $X_1,\ldots,X_n \in \mathbb{R}^d$ cannot be directly observed, and all estimators are functions of the randomised output of a suitable privacy mechanism. The statistician is free to choose the form of the privacy mechanism, and in this work we propose to add Laplace distributed noise to a discretisation of the location of a vector $X_i$. Based on these randomised data, we design a novel estimator of the density function, which can be viewed as a privatised version of the well-studied histogram density estimator. Our theoretical results include universal pointwise consistency and strong universal $L_1$-consistency. In addition, a convergence rate over classes of Lipschitz functions is derived, which is complemented by a matching minimax lower bound. We illustrate the trade-off between data utility and privacy by means of a small simulation study.
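To make the mechanism concrete, the following is a minimal sketch of a Laplace-perturbed histogram, not the paper's exact construction. It assumes $d = 1$, data supported on $[0,1]$, a fixed number of equal-width bins, and a privacy parameter `epsilon`; the function names `privatise` and `private_histogram` and the noise scale $2/\varepsilon$ are illustrative choices that the abstract does not specify.

```python
import numpy as np


def privatise(x, n_bins, epsilon, rng):
    """Hypothetical local mechanism: each data holder releases a noisy
    one-hot encoding of the bin that contains their observation."""
    x = np.clip(np.asarray(x, dtype=float).ravel(), 0.0, 1.0)
    # Discretise the location: index of the bin containing each x_i.
    bins = np.minimum((x * n_bins).astype(int), n_bins - 1)
    one_hot = np.zeros((x.size, n_bins))
    one_hot[np.arange(x.size), bins] = 1.0
    # Two one-hot vectors differ by at most 2 in L1 norm, so independent
    # Laplace noise with scale 2/epsilon on every coordinate is the
    # standard calibration for epsilon-local differential privacy.
    noise = rng.laplace(loc=0.0, scale=2.0 / epsilon, size=one_hot.shape)
    return one_hot + noise


def private_histogram(z, n_bins):
    """Histogram-type density estimate built only from the privatised
    outputs z (one row per data holder); returns one value per bin."""
    h = 1.0 / n_bins  # bin width on [0, 1]
    return z.mean(axis=0) / h


rng = np.random.default_rng(0)
x = rng.beta(2.0, 5.0, size=50_000)  # raw data, never seen by the statistician
z = privatise(x, n_bins=10, epsilon=1.0, rng=rng)
f_hat = private_histogram(z, n_bins=10)
```

Because the noise has mean zero, averaging the privatised vectors recovers the bin frequencies up to an error that grows as `epsilon` shrinks, which is the utility-privacy trade-off the abstract refers to. Note that in this sketch individual bin values can be negative, so the raw output is not guaranteed to be a density.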