In this work, we study high-dimensional mean estimation under user-level differential privacy, and design an $(\varepsilon,\delta)$-differentially private mechanism using as few users as possible. In particular, we provide a nearly optimal trade-off between the number of users and the number of samples per user required for private mean estimation, even when the number of users is as low as $O(\frac{1}{\varepsilon}\log\frac{1}{\delta})$. Interestingly, this bound on the number of \emph{users} is independent of the dimension (though the number of \emph{samples per user} may depend polynomially on the dimension), in contrast to previous work, which requires the number of users to grow polynomially with the dimension. This resolves a problem first posed by Amin et al. Moreover, our mechanism is robust against corruption of up to $49\%$ of the users. Finally, our results also yield optimal algorithms for privately learning discrete distributions with few users, answering a question of Liu et al., and extend to a broader range of problems, such as stochastic convex optimization and a variant of stochastic gradient descent, via a reduction to differentially private mean estimation.