To protect sensitive training data, differentially private stochastic gradient descent (DP-SGD) has been adopted in deep learning to provide rigorously defined privacy. However, DP-SGD requires injecting an amount of noise that scales with the number of gradient dimensions, resulting in large performance drops compared to non-private training. In this work, we propose random freeze, which randomly freezes a progressively increasing subset of parameters, yielding sparse gradient updates while maintaining or increasing accuracy. We theoretically prove the convergence of random freeze and find that it exhibits a signal loss versus perturbation moderation trade-off in DP-SGD. Applying random freeze across various DP-SGD frameworks, we maintain accuracy within the same number of iterations while achieving up to 70% representation sparsity, which demonstrates that this trade-off exists across a variety of DP-SGD methods. We further note that random freeze significantly improves accuracy, in particular for large networks. Additionally, the axis-aligned sparsity induced by random freeze offers further advantages for projected DP-SGD and federated learning in terms of computational cost, memory footprint, and communication overhead.
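To make the idea concrete, the following is a minimal, self-contained sketch (not the authors' code) of one DP-SGD step with a random freeze mask on a toy quadratic objective. The schedule function `freeze_ratio_at`, the helper `dp_sgd_step`, and the per-step re-sampling of the frozen set are illustrative assumptions; the paper's actual freezing schedule and implementation details may differ.

```python
# Sketch of DP-SGD with a progressively growing "random freeze" mask.
# All names and the linear freeze schedule are illustrative, not the paper's code.
import numpy as np

rng = np.random.default_rng(0)

def freeze_ratio_at(t, total_steps, final_ratio=0.7):
    """Fraction of parameters frozen at step t (simple linear schedule)."""
    return final_ratio * t / total_steps

def dp_sgd_step(theta, per_example_grads, clip_norm, noise_mult, lr, active_mask):
    """One DP-SGD update restricted to the non-frozen (active) coordinates."""
    clipped = []
    for g in per_example_grads:
        g = g * active_mask                       # frozen coordinates get no signal
        norm = np.linalg.norm(g)
        clipped.append(g * min(1.0, clip_norm / (norm + 1e-12)))
    grad_sum = np.sum(clipped, axis=0)
    # Noise is added only on active coordinates, so the total perturbation shrinks
    # as more coordinates are frozen ("perturbation moderation"), at the cost of
    # losing gradient signal on the frozen ones ("signal loss").
    noise = noise_mult * clip_norm * rng.standard_normal(theta.shape) * active_mask
    avg_grad = (grad_sum + noise) / len(per_example_grads)
    return theta - lr * avg_grad

# Toy usage: loss 0.5 * ||theta - target||^2, identical per-example gradients.
dim, batch, total_steps = 50, 32, 200
theta = rng.standard_normal(dim)
target = np.zeros(dim)
for t in range(total_steps):
    ratio = freeze_ratio_at(t, total_steps)
    frozen = rng.choice(dim, size=int(ratio * dim), replace=False)
    active_mask = np.ones(dim)
    active_mask[frozen] = 0.0
    grads = [theta - target for _ in range(batch)]
    theta = dp_sgd_step(theta, grads, clip_norm=1.0, noise_mult=1.0, lr=0.05,
                        active_mask=active_mask)
print("final distance to target:", np.linalg.norm(theta - target))
```

Because the update touches only the active coordinates, the resulting gradients are axis-aligned sparse, which is the property the abstract credits with reducing computational cost, memory footprint, and communication overhead.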