We study the problem of estimating non-linear functionals of discrete distributions under local differential privacy. The initial data $x_1,\ldots,x_n \in [K]$ are assumed to be i.i.d. and distributed according to an unknown discrete distribution $p = (p_1,\ldots,p_K)$. Only $\alpha$-locally differentially private (LDP) samples $z_1,\ldots,z_n$ are publicly available, where the term 'local' means that each $z_i$ is produced using one individual attribute $x_i$. We exhibit privacy mechanisms (PM) that are either interactive (i.e. they are allowed to use previously published privatized data) or non-interactive. We describe the behavior of the quadratic risk for estimating the power sum functional $F_{\gamma} = \sum_{k=1}^K p_k^{\gamma}$, $\gamma > 0$, as a function of $K$, $n$ and $\alpha$. In the non-interactive case, we study two plug-in type estimators of $F_{\gamma}$, for all $\gamma > 0$, that are similar to the MLE analyzed by Jiao et al. (2017) in the multinomial model. However, due to the privacy constraint, the rates we attain are slower and similar to those obtained in the Gaussian model by Collier et al. (2020). In the interactive case, we introduce for all $\gamma > 1$ a two-step procedure which attains the faster parametric rate $(n \alpha^2)^{-1/2}$ when $\gamma \geq 2$. We give lower bounds that hold over all $\alpha$-LDP mechanisms and all estimators using the private samples.
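To make the non-interactive setting concrete, the following is a minimal sketch, not the mechanisms or estimators of the paper. It assumes a standard Laplace perturbation of the one-hot encoding of each $x_i$ (noise scale $2/\alpha$, which gives $\alpha$-LDP because two one-hot vectors differ by at most 2 in $\ell_1$ norm) and a naive plug-in estimator that clips the averaged private frequencies to $[0,1]$ before raising them to the power $\gamma$. The function names `privatize`, `plug_in_power_sum` and `power_sum` are hypothetical, and categories are indexed from 0 rather than 1.

```python
import random


def laplace(scale, rng):
    # The difference of two i.i.d. Exp(1) variables is a standard Laplace variable.
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def privatize(x, K, alpha, rng):
    # Assumed mechanism: one-hot encode x in {0, ..., K-1} and add independent
    # Laplace noise of scale 2/alpha to each coordinate (alpha-LDP).
    scale = 2.0 / alpha
    return [(1.0 if k == x else 0.0) + laplace(scale, rng) for k in range(K)]


def plug_in_power_sum(z, gamma):
    # Naive plug-in: average the private vectors coordinate-wise, clip each
    # estimated frequency to [0, 1], then sum the gamma-th powers.
    n, K = len(z), len(z[0])
    total = 0.0
    for k in range(K):
        p_hat = sum(row[k] for row in z) / n
        p_hat = min(1.0, max(0.0, p_hat))
        total += p_hat ** gamma
    return total


def power_sum(p, gamma):
    # True functional F_gamma = sum_k p_k^gamma, for comparison.
    return sum(pk ** gamma for pk in p)
```

For example, one can draw $x_1,\ldots,x_n$ with `random.Random(seed).choices(range(K), weights=p, k=n)`, pass each through `privatize`, and compare `plug_in_power_sum(z, gamma)` with `power_sum(p, gamma)`. In this sketch the noise variance per coordinate is $8/\alpha^2$, so the error depends on $n$ and $\alpha$ jointly and grows as $\alpha$ shrinks.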