We give the first polynomial-time algorithm to estimate the mean of a $d$-variate probability distribution with bounded covariance from $\tilde{O}(d)$ independent samples subject to pure differential privacy. Prior algorithms for this problem either incur exponential running time, require $\Omega(d^{1.5})$ samples, or satisfy only the weaker concentrated or approximate differential privacy conditions. In particular, all prior polynomial-time algorithms require $d^{1+\Omega(1)}$ samples to guarantee small privacy loss with "cryptographically" high probability, $1-2^{-d^{\Omega(1)}}$, while our algorithm retains $\tilde{O}(d)$ sample complexity even in this stringent setting. Our main technique is a new approach to using the powerful Sum of Squares (SoS) method to design differentially private algorithms. "SoS proofs to algorithms" is a key theme in numerous recent works in high-dimensional algorithmic statistics: estimators that apparently require exponential running time, but whose analysis can be captured by low-degree SoS proofs, can be automatically turned into polynomial-time algorithms with the same provable guarantees. We demonstrate a similar "proofs to private algorithms" phenomenon: instances of the workhorse exponential mechanism that apparently require exponential time, but that can be analyzed with low-degree SoS proofs, can be automatically turned into polynomial-time differentially private algorithms. We prove a meta-theorem capturing this phenomenon, which we expect to be of broad use in private algorithm design. Our techniques also draw new connections between differentially private and robust statistics in high dimensions. In particular, viewed through our proofs-to-private-algorithms lens, several well-studied SoS proofs from recent works in algorithmic robust statistics directly yield key components of our differentially private mean estimation algorithm.
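For reference, the two standard notions the abstract relies on can be stated as follows. This is the textbook formulation, not the paper's own statement, and the symbols $\varepsilon$, $M$, $s$, and $\Delta$ are notation assumed here rather than taken from the abstract. A randomized algorithm $M$ satisfies pure $\varepsilon$-differential privacy if, for all datasets $X, X'$ differing in a single sample and all events $S$,

$$\Pr[M(X) \in S] \le e^{\varepsilon} \, \Pr[M(X') \in S].$$

Given a score function $s(X, y)$ over candidate outputs $y$ with sensitivity $\Delta = \max_{y} \max_{X \sim X'} |s(X, y) - s(X', y)|$, the exponential mechanism outputs $y$ with probability

$$\Pr[M(X) = y] \propto \exp\!\left(\frac{\varepsilon \, s(X, y)}{2\Delta}\right),$$

which satisfies pure $\varepsilon$-differential privacy. Sampling from this distribution directly can take exponential time when the candidate set is large or high-dimensional, which is the obstacle the abstract describes.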