We present two sample-efficient differentially private mean estimators for $d$-dimensional (sub)Gaussian distributions with unknown covariance. Informally, given $n \gtrsim d/\alpha^2$ samples from such a distribution with mean $\mu$ and covariance $\Sigma$, our estimators output $\tilde\mu$ such that $\| \tilde\mu - \mu \|_{\Sigma} \leq \alpha$, where $\| \cdot \|_{\Sigma}$ is the Mahalanobis distance. All previous estimators with the same guarantee either require strong a priori bounds on the covariance matrix or require $\Omega(d^{3/2})$ samples. Each of our estimators is based on a simple, general approach to designing differentially private mechanisms, but with novel technical steps to make the estimator private and sample-efficient. Our first estimator samples a point with approximately maximum Tukey depth using the exponential mechanism, but restricted to the set of points of large Tukey depth. Proving that this mechanism is private requires a novel analysis. Our second estimator perturbs the empirical mean of the data set with noise calibrated to the empirical covariance, without releasing the covariance itself. Its sample complexity guarantees hold more generally for subgaussian distributions, albeit with a slightly worse dependence on the privacy parameter. For both estimators, careful preprocessing of the data is required to satisfy differential privacy.
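As a rough illustration of the error metric and of the shape of the second estimator, the following sketch computes $\| \tilde\mu - \mu \|_{\Sigma} = \| \Sigma^{-1/2}(\tilde\mu - \mu) \|_2$ and adds Gaussian noise with covariance proportional to the empirical covariance. The function names and the `noise_scale` parameter are hypothetical and do not come from the paper. The sketch omits the preprocessing that the abstract says is required, so as written it is not differentially private and should not be read as the authors' mechanism.

```python
import numpy as np


def mahalanobis_error(mu_hat, mu, cov):
    """Return ||mu_hat - mu||_Sigma = ||Sigma^{-1/2} (mu_hat - mu)||_2."""
    diff = np.asarray(mu_hat, dtype=float) - np.asarray(mu, dtype=float)
    # Solve cov @ y = diff instead of forming the inverse explicitly.
    return float(np.sqrt(diff @ np.linalg.solve(cov, diff)))


def covariance_shaped_noisy_mean(x, noise_scale, rng):
    """Empirical mean plus Gaussian noise shaped like the empirical covariance.

    Illustration only: noise_scale is a hypothetical placeholder, and the
    preprocessing needed for a privacy guarantee is NOT implemented here.
    """
    x = np.asarray(x, dtype=float)
    mu_hat = x.mean(axis=0)
    # rowvar=False treats rows as samples; atleast_2d covers the d = 1 case.
    cov_hat = np.atleast_2d(np.cov(x, rowvar=False))
    # The Cholesky factor satisfies chol @ chol.T == cov_hat, so
    # chol @ z has covariance cov_hat when z is standard normal.
    chol = np.linalg.cholesky(cov_hat)
    z = rng.standard_normal(x.shape[1])
    return mu_hat + noise_scale * (chol @ z)
```

Because the noise has the same shape as the data, its size measured in $\| \cdot \|_{\Sigma}$ does not depend on how the coordinates are scaled, which is the motivation for calibrating noise to the covariance rather than to a fixed a priori bound.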