Differentially private (DP) optimization is the standard paradigm to learn large neural networks that are accurate and privacy-preserving. The computational cost of DP deep learning, however, is notoriously heavy due to the per-sample gradient clipping. Existing DP implementations are $2-1000\times$ more costly in time and space complexity than the standard (non-private) training. In this work, we develop a novel Book-Keeping (BK) technique that implements existing DP optimizers (thus achieving the same accuracy) with a substantial improvement in computational cost. Specifically, BK enables DP training on large models and high-dimensional data to be roughly as efficient as standard training, whereas previous DP algorithms can be inefficient or incapable of training due to memory errors. The computational advantage of BK is supported by a complexity analysis as well as extensive experiments on vision and language tasks. Our implementation achieves state-of-the-art (SOTA) accuracy with a very small extra cost: on GPT2 and at the same memory cost, BK has $1.0\times$ the time complexity of standard training ($0.75\times$ training speed in practice) and $0.6\times$ the time complexity of the most efficient DP implementation ($1.24\times$ training speed in practice). We will open-source the codebase for the BK algorithm.
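To make the bottleneck named above concrete, the sketch below shows the standard per-sample gradient clipping step of DP-SGD implemented naively in PyTorch, with one backward pass per example. This is not the BK algorithm; it is a minimal, hypothetical illustration of why existing DP implementations pay a large time and memory overhead, and all function names and hyperparameters (`dp_sgd_step`, `clip_norm`, `noise_multiplier`, `lr`) are ours, not the paper's.

```python
# Minimal sketch of naive per-sample gradient clipping (DP-SGD style).
# Illustrative only: NOT the Book-Keeping (BK) technique, which avoids
# materializing per-sample gradients.
import torch

def dp_sgd_step(model, loss_fn, batch_x, batch_y,
                clip_norm=1.0, noise_multiplier=1.0, lr=0.1):
    params = [p for p in model.parameters() if p.requires_grad]
    summed = [torch.zeros_like(p) for p in params]

    # Naive approach: one backward pass per example to obtain per-sample gradients.
    for x, y in zip(batch_x, batch_y):
        loss = loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0))
        grads = torch.autograd.grad(loss, params)
        # Clip each per-sample gradient to norm at most clip_norm.
        total_norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))
        scale = (clip_norm / (total_norm + 1e-6)).clamp(max=1.0)
        for s, g in zip(summed, grads):
            s.add_(g * scale)

    # Add Gaussian noise calibrated to the clipping norm, then average and update.
    batch_size = len(batch_x)
    with torch.no_grad():
        for p, s in zip(params, summed):
            noise = torch.randn_like(s) * noise_multiplier * clip_norm
            p.add_(-(lr / batch_size) * (s + noise))
```

The explicit per-example backward pass (or, in other implementations, storing all per-sample gradients at once) is what drives the $2-1000\times$ overhead cited above; the abstract's claim is that BK produces the same clipped, noised update of existing DP optimizers at roughly the cost of standard training.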