Models need to be trained with privacy-preserving learning algorithms to prevent leakage of possibly sensitive information contained in their training data. However, canonical algorithms like differentially private stochastic gradient descent (DP-SGD) do not benefit from model scale in the same way as non-private learning. This manifests itself as unappealing tradeoffs between privacy and utility (accuracy) when using DP-SGD on complex tasks. To remedy this tension, a paradigm is emerging: fine-tuning with differential privacy from a model pretrained on public (i.e., non-sensitive) training data. In this work, we identify an oversight of existing approaches for differentially private fine-tuning: they do not tailor the fine-tuning procedure to the specifics of learning with privacy. Our main result is to show that carefully selecting which layers of the pretrained neural network are fine-tuned establishes new state-of-the-art tradeoffs between privacy and accuracy. For instance, we achieve 77.9% accuracy at $(\varepsilon, \delta)=(2, 10^{-5})$ on CIFAR-100 for a model pretrained on ImageNet. Our work calls for additional hyperparameter search to configure the differentially private fine-tuning procedure itself.
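To make the setting concrete, the following is a minimal sketch (not the authors' code) of differentially private fine-tuning in which only a chosen subset of a pretrained network's layers is trained. It assumes a torchvision ResNet-50 pretrained on ImageNet, the Opacus `PrivacyEngine` for DP-SGD, and CIFAR-100 as the fine-tuning task; the particular layer selection (last residual block plus the classifier head), batch size, learning rate, and epoch count are illustrative placeholders, not the paper's configuration.

```python
import torch
from torch import nn, optim
from torchvision import datasets, models, transforms
from opacus import PrivacyEngine
from opacus.validators import ModuleValidator

# Pretrained backbone with a fresh CIFAR-100 classification head.
model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, 100)

# BatchNorm mixes information across examples in a batch, which is
# incompatible with per-example gradient clipping; Opacus can swap it
# for GroupNorm (note: this discards the pretrained BatchNorm statistics).
model = ModuleValidator.fix(model)

# Freeze all parameters, then unfreeze only the layers selected for
# fine-tuning (illustrative choice: final residual block + classifier head).
for p in model.parameters():
    p.requires_grad = False
for p in list(model.layer4.parameters()) + list(model.fc.parameters()):
    p.requires_grad = True

optimizer = optim.SGD(
    (p for p in model.parameters() if p.requires_grad), lr=0.1, momentum=0.9
)

transform = transforms.Compose([transforms.Resize(224), transforms.ToTensor()])
train_loader = torch.utils.data.DataLoader(
    datasets.CIFAR100(root="./data", train=True, download=True, transform=transform),
    batch_size=256,
    shuffle=True,
)

# PrivacyEngine wraps the model, optimizer, and loader so that per-example
# gradients are clipped and noised, calibrated to (epsilon, delta) = (2, 1e-5).
privacy_engine = PrivacyEngine()
model, optimizer, train_loader = privacy_engine.make_private_with_epsilon(
    module=model,
    optimizer=optimizer,
    data_loader=train_loader,
    target_epsilon=2.0,
    target_delta=1e-5,
    epochs=10,
    max_grad_norm=1.0,
)
```

The privacy accounting covers only the fine-tuned parameters, since the frozen layers receive no gradient updates from the sensitive data; which layers to unfreeze is itself a hyperparameter of the private fine-tuning procedure.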