An important problem in deep learning is the privacy and security of neural networks (NNs). These two aspects have long been studied separately, and it is still poorly understood how privacy-enhancing training affects the robustness of NNs. This paper experimentally evaluates the impact of training with Differential Privacy (DP), a standard method for privacy preservation, on model vulnerability against a broad range of adversarial attacks. The results suggest that private models are less robust than their non-private counterparts, and that adversarial examples transfer better among DP models than between non-private and private ones. Furthermore, detailed analyses of DP and non-DP models suggest significant differences between their gradients. Additionally, this work is the first to observe that an unfavorable choice of parameters in DP training can lead to gradient masking and thereby to a false sense of security.
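To make the evaluated setting concrete, the sketch below illustrates the two ingredients the abstract refers to: DP training in the style of DP-SGD (per-example gradient clipping plus Gaussian noise) and a simple robustness check with FGSM adversarial examples. This is a minimal illustration, not the paper's exact experimental setup; the model, data loader, and hyperparameters (clip norm, noise multiplier, epsilon) are hypothetical placeholders, and the code assumes a PyTorch classifier on inputs scaled to [0, 1].

```python
import torch
import torch.nn.functional as F

def dp_sgd_step(model, optimizer, x_batch, y_batch,
                clip_norm=1.0, noise_multiplier=1.0):
    """One DP-SGD-style step: clip each per-example gradient to `clip_norm`,
    sum, add Gaussian noise scaled by `noise_multiplier`, then average."""
    optimizer.zero_grad()
    summed = [torch.zeros_like(p) for p in model.parameters()]
    for x, y in zip(x_batch, y_batch):  # per-example gradients
        loss = F.cross_entropy(model(x.unsqueeze(0)), y.unsqueeze(0))
        grads = torch.autograd.grad(loss, list(model.parameters()))
        norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))
        scale = (clip_norm / (norm + 1e-6)).clamp(max=1.0)  # clip to clip_norm
        for s, g in zip(summed, grads):
            s += g * scale
    batch_size = x_batch.shape[0]
    for p, s in zip(model.parameters(), summed):
        noise = torch.randn_like(s) * noise_multiplier * clip_norm
        p.grad = (s + noise) / batch_size  # noisy, averaged gradient
    optimizer.step()

def fgsm(model, x, y, eps=0.03):
    """Fast Gradient Sign Method: perturb inputs along the sign of the
    input gradient; used here as a basic robustness probe."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    return (x + eps * x.grad.sign()).clamp(0, 1).detach()
```

Comparing accuracy on `fgsm(...)` outputs for a model trained with `dp_sgd_step` versus one trained with plain SGD is a simple way to probe the robustness gap the abstract describes; gradient masking would show up as deceptively high robustness against such gradient-based attacks.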