Most works on learning with differential privacy (DP) have focused on the setting where each user holds a single sample. In this work, we consider the setting where each user holds $m$ samples and privacy protection is enforced at the level of each user's entire data. We show that, in this setting, we can learn with far fewer users. Specifically, we show that, as long as each user receives sufficiently many samples, we can learn any privately learnable class via an $(\epsilon, \delta)$-DP algorithm using only $O(\log(1/\delta)/\epsilon)$ users. For $\epsilon$-DP algorithms, we show that we can learn using only $O_{\epsilon}(d)$ users even in the local model, where $d$ is the probabilistic representation dimension. In both cases, we show a nearly matching lower bound on the number of users required. A crucial component of our results is a generalization of global stability [Bun et al., FOCS 2020] that allows the use of public randomness. Under this relaxed notion, we employ a correlated sampling strategy to show that global stability can be boosted to be arbitrarily close to one, at a polynomial expense in the number of samples.
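As background for the last claim, the following is the standard correlated-sampling guarantee (Broder, 1997; Holenstein, 2007), stated here only for intuition; the paper's specific boosting argument is not reproduced, and the notation $\mathsf{CS}$ is introduced purely for illustration. Given shared public randomness $r$, two parties holding distributions $P$ and $Q$ over a common domain can each output a sample distributed according to their own distribution while satisfying
\[
  \Pr_{r}\bigl[\mathsf{CS}(P; r) \neq \mathsf{CS}(Q; r)\bigr]
  \;\le\; \frac{2\, d_{\mathrm{TV}}(P, Q)}{1 + d_{\mathrm{TV}}(P, Q)}
  \;\le\; 2\, d_{\mathrm{TV}}(P, Q),
\]
where $d_{\mathrm{TV}}$ denotes the total variation distance. Thus, when two output distributions are close, sampling both with the same public randomness makes the sampled outputs coincide with probability close to one.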