In DP-SGD, each round communicates a local SGD update, which leaks new information about the underlying local data set to the outside world. To provide privacy, Gaussian noise with standard deviation $\sigma$ is added to each local SGD update after a clipping operation. We show that, to attain $(\epsilon,\delta)$-differential privacy, $\sigma$ can be chosen equal to $\sqrt{2(\epsilon +\ln(1/\delta))/\epsilon}$ for $\epsilon=\Omega(T/N^2)$, where $T$ is the total number of rounds and $N$ is the size of the local data set. In many existing machine learning problems, $N$ is large and $T=O(N)$. Hence, $\sigma$ becomes "independent" of any choice of $T=O(N)$ as long as $\epsilon=\Omega(1/N)$. This means that our $\sigma$ depends only on $N$ rather than on $T$. As shown in our paper, this differential privacy characterization allows one to select the parameters of DP-SGD {\it a-priori}, based on a fixed privacy budget (in terms of $\epsilon$ and $\delta$), so as to optimize the anticipated utility (test accuracy). This ability to plan ahead, together with $\sigma$'s independence of $T$ (which allows local gradient computations to be split among as many rounds as needed, even for large $T$, as usually happens in practice), leads to a {\it proactive DP-SGD algorithm} that allows a client to balance its privacy budget against the accuracy of the learned global model based on local test data. We note that the current state-of-the-art differential privacy accountant method based on $f$-DP has a closed form for computing the privacy loss of DP-SGD. However, because this closed form is complex to interpret, it cannot be used in a simple way to plan ahead. Instead, accountant methods are only used to keep track of how the privacy budget has been spent (after the fact).
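As a rough illustration of the planning step described above, the following Python sketch evaluates the stated noise level $\sigma=\sqrt{2(\epsilon+\ln(1/\delta))/\epsilon}$ for a given budget and checks the regime $\epsilon=\Omega(T/N^2)$. The function names and the constant `c` are assumptions made for this example only: the abstract does not specify the constant hidden in the $\Omega(\cdot)$ notation, so `c = 1.0` is a placeholder and not a value taken from the paper.

```python
import math


def dp_sgd_sigma(epsilon: float, delta: float) -> float:
    """Noise level sigma = sqrt(2 * (epsilon + ln(1/delta)) / epsilon).

    Note that the number of rounds T does not appear in the formula.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if not 0 < delta < 1:
        raise ValueError("delta must lie strictly between 0 and 1")
    return math.sqrt(2.0 * (epsilon + math.log(1.0 / delta)) / epsilon)


def in_stated_regime(epsilon: float, T: int, N: int, c: float = 1.0) -> bool:
    """Check epsilon >= c * T / N^2.

    ASSUMPTION: c stands in for the unspecified constant behind
    epsilon = Omega(T / N^2); 1.0 is a placeholder, not a value from the paper.
    """
    return epsilon >= c * T / (N * N)


if __name__ == "__main__":
    eps, delta = 1.0, 1e-5
    N, T = 50_000, 50_000  # illustrative sizes with T = O(N)
    print("sigma:", dp_sgd_sigma(eps, delta))
    print("regime check (placeholder c):", in_stated_regime(eps, T, N))
```

With $\epsilon=1$ and $\delta=10^{-5}$ the formula gives $\sigma$ of approximately 5.0, and changing `T` alters only the regime check, never the value returned by `dp_sgd_sigma`.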