Machine learning models have excelled in a variety of domains and attracted increasing attention from both the security and the privacy communities. One important yet worrying question is: does training models under the differential privacy (DP) constraint unfavorably impact their adversarial robustness? While previous works have postulated that privacy comes at the cost of worse robustness, we give the first theoretical analysis showing that DP models can indeed be robust and accurate, sometimes even more robust than their naturally trained non-private counterparts. We observe three key factors that influence the privacy-robustness-accuracy tradeoff: (1) the hyperparameters of DP optimizers are critical; (2) pre-training on public data significantly mitigates the drop in accuracy and robustness; (3) the choice of DP optimizer makes a difference. With these factors set properly, we achieve 90\% natural accuracy, 72\% robust accuracy ($+9\%$ over the non-private model) under the $l_2(0.5)$ attack, and 69\% robust accuracy ($+16\%$ over the non-private model) with a pre-trained SimCLRv2 model under the $l_\infty(4/255)$ attack, on CIFAR10 with $\epsilon=2$. In fact, we show both theoretically and empirically that DP models are Pareto optimal on the accuracy-robustness tradeoff. Empirically, the robustness of DP models is consistently observed on the MNIST, Fashion-MNIST, and CelebA datasets, with ResNet and Vision Transformer architectures. We believe our encouraging results are a significant step towards training models that are private as well as robust.
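To make the role of the DP optimizer hyperparameters concrete, below is a minimal sketch of DP-SGD training calibrated to $\epsilon=2$ using the Opacus library. This is an illustrative assumption rather than the paper's actual setup: the toy model, random stand-in data, and hyperparameter values (learning rate, clipping norm, epochs) are all placeholders.

\begin{verbatim}
# Minimal DP-SGD sketch with Opacus (illustrative; not the paper's
# exact configuration). Model, data, and hyperparameters are placeholders.
import torch
from torch import nn, optim
from torch.utils.data import DataLoader, TensorDataset
from opacus import PrivacyEngine

# Toy stand-in for CIFAR10: 512 random 3x32x32 images, 10 classes.
data = TensorDataset(torch.randn(512, 3, 32, 32),
                     torch.randint(0, 10, (512,)))
loader = DataLoader(data, batch_size=128)

model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
optimizer = optim.SGD(model.parameters(), lr=0.5)

# Calibrate the noise so the whole run satisfies (eps=2, delta=1e-5)-DP.
engine = PrivacyEngine()
model, optimizer, loader = engine.make_private_with_epsilon(
    module=model, optimizer=optimizer, data_loader=loader,
    epochs=10, target_epsilon=2.0, target_delta=1e-5,
    max_grad_norm=1.0,  # per-sample gradient clipping norm
)

criterion = nn.CrossEntropyLoss()
for epoch in range(10):
    for x, y in loader:
        optimizer.zero_grad()
        criterion(model(x), y).backward()
        optimizer.step()  # clips per-sample grads, adds Gaussian noise
\end{verbatim}

In this setup, the clipping norm, noise level (derived from the target $\epsilon$), batch size, and learning rate are exactly the knobs the abstract identifies as critical to the privacy-robustness-accuracy tradeoff.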