Classical differentially private DP-SGD implements individual clipping with random subsampling, which forces a mini-batch SGD approach. We provide a general differentially private algorithmic framework that goes beyond DP-SGD and allows any first-order optimizer (e.g., classical SGD and momentum-based SGD approaches) in combination with batch clipping, which clips an aggregate of computed gradients rather than summing clipped gradients (as is done in individual clipping). The framework also admits sampling techniques beyond random subsampling, such as shuffling. Our DP analysis follows the $f$-DP approach and introduces a new proof technique which also allows us to analyze group privacy. In particular, for $E$ epochs of work and groups of size $g$, we show a $\sqrt{gE}$ DP dependency for batch clipping with shuffling. This is much better than the previously anticipated linear dependency on $g$, and much better than the previously expected square-root dependency on the total number of rounds within $E$ epochs, which is generally much larger than $\sqrt{E}$.
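To make the contrast between the two clipping strategies concrete, the following is a minimal sketch, not the authors' implementation: it assumes per-example gradients are already available as arrays, and the helper names (`clip`, `individual_clipping_update`, `batch_clipping_update`) are hypothetical. Individual clipping (DP-SGD) clips each per-example gradient before summing; batch clipping first aggregates the gradients and clips only the aggregate, so the noisy result can be handed to any first-order optimizer.

```python
import numpy as np

def clip(v, C):
    """Scale v so its L2 norm is at most C."""
    norm = np.linalg.norm(v)
    return v * min(1.0, C / norm) if norm > 0 else v

def individual_clipping_update(per_example_grads, C, sigma, rng):
    """DP-SGD style: clip each per-example gradient, sum, then add noise
    calibrated to the per-example sensitivity C."""
    summed = sum(clip(g, C) for g in per_example_grads)
    noise = sigma * C * rng.standard_normal(summed.shape)
    return (summed + noise) / len(per_example_grads)

def batch_clipping_update(per_example_grads, C, sigma, rng):
    """Batch clipping: aggregate (here, average) first, clip the single
    aggregate to norm C, then add noise calibrated to C."""
    aggregate = np.mean(per_example_grads, axis=0)
    clipped = clip(aggregate, C)
    noise = sigma * C * rng.standard_normal(clipped.shape)
    return clipped + noise

# Toy usage with random per-example gradients.
rng = np.random.default_rng(0)
grads = [rng.standard_normal(4) for _ in range(8)]
print(individual_clipping_update(grads, C=1.0, sigma=0.5, rng=rng))
print(batch_clipping_update(grads, C=1.0, sigma=0.5, rng=rng))
```

Because batch clipping releases one noisy aggregate per round rather than a sum of individually clipped gradients, that aggregate can be consumed by any first-order update rule (e.g., a momentum step), which is the flexibility the framework described above exploits.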