Differentially private SGD (DPSGD) has recently shown promise in deep learning. However, compared to non-private SGD, DPSGD incurs computational overheads that can undo the benefit of batching on GPUs. Micro-batching is a common method to alleviate this and is fully supported in the TensorFlow Privacy library (TFDP); however, it degrades accuracy. We propose NanoBatch Privacy, a lightweight add-on to TFDP for Graphcore IPUs that leverages a batch size of 1 (without micro-batching) together with gradient accumulation. This allows us to reach large total batch sizes with minimal impact on throughput. Second, using CIFAR-10, we illustrate that larger batch sizes are not necessarily optimal from a privacy-versus-utility perspective. On ImageNet, we achieve a speedup of more than 15x over TFDP running on 8x A100 GPUs, and significant speedups even over other libraries such as Opacus. We also provide two extensions: 1) DPSGD for pipelined models and 2) per-layer clipping that is 15x faster than the Opacus implementation on 8x A100s. Finally, as an application case study, we apply NanoBatch training to private Covid-19 chest CT prediction.
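To make the core idea concrete, the following is a minimal sketch of DPSGD built from micro-batches of size 1 with gradient accumulation, written in plain TensorFlow rather than the authors' IPU/TFDP implementation. The names `model`, `loss_fn`, `optimizer`, `examples`, `labels`, and the constants `CLIP_NORM`, `NOISE_MULT`, and `ACCUM_STEPS` are illustrative assumptions, not part of the paper's code.

```python
# Minimal sketch (assumptions noted above): per-example clipping with batch size 1,
# accumulated over ACCUM_STEPS examples before a single noisy parameter update.
import tensorflow as tf

CLIP_NORM = 1.0      # per-example L2 clipping bound C (assumed value)
NOISE_MULT = 1.1     # noise multiplier sigma (assumed value)
ACCUM_STEPS = 512    # per-example gradients accumulated per update (assumed value)

def dpsgd_step(model, loss_fn, optimizer, examples, labels):
    """One DPSGD update assembled from ACCUM_STEPS batch-size-1 gradients."""
    accum = [tf.zeros_like(v) for v in model.trainable_variables]
    for i in range(ACCUM_STEPS):
        x = tf.expand_dims(examples[i], 0)   # micro-batch containing one example
        y = tf.expand_dims(labels[i], 0)
        with tf.GradientTape() as tape:
            loss = loss_fn(y, model(x, training=True))
        grads = tape.gradient(loss, model.trainable_variables)
        # Clip the per-example gradient to L2 norm CLIP_NORM.
        clipped, _ = tf.clip_by_global_norm(grads, CLIP_NORM)
        accum = [a + g for a, g in zip(accum, clipped)]
    # Add Gaussian noise calibrated to the clipping bound, then average and apply.
    noised = [
        (a + tf.random.normal(tf.shape(a), stddev=NOISE_MULT * CLIP_NORM)) / ACCUM_STEPS
        for a in accum
    ]
    optimizer.apply_gradients(zip(noised, model.trainable_variables))
```

Because each gradient is computed on a single example, no per-example gradient bookkeeping (as needed for micro-batching larger than 1) is required; the accumulation loop is what lets the effective batch size grow without changing the on-device batch size.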