Learning often involves sensitive data, and as such, privacy-preserving extensions to Stochastic Gradient Descent (SGD) and other machine learning algorithms have been developed using the definitions of Differential Privacy (DP). In differentially private SGD, the gradients computed at each training iteration are subject to two different types of noise: first, inherent sampling noise arising from the use of minibatches; second, additive Gaussian noise from the underlying mechanisms that introduce privacy. In this study, we show that these two types of noise are equivalent in their effect on the utility of private neural networks; however, they are not accounted for equally in the privacy budget. Given this observation, we propose a training paradigm that shifts the proportion of noise towards less inherent and more additive noise, so that more of the overall noise can be accounted for in the privacy budget. With this paradigm, we are able to improve on the state of the art in the privacy/utility tradeoff of private end-to-end CNNs.
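To make the two noise sources concrete, below is a minimal sketch of one differentially private SGD step on a toy linear model with squared-error loss. The function name `dp_sgd_step` and the parameters `clip_norm`, `noise_multiplier`, and `batch_size` are illustrative assumptions, not the paper's notation; the sketch only shows where minibatch sampling noise and additive Gaussian noise enter the update.

```python
import numpy as np

def dp_sgd_step(w, X, y, lr=0.1, batch_size=32, clip_norm=1.0,
                noise_multiplier=1.0, rng=None):
    """One illustrative DP-SGD step for a linear model with squared-error loss."""
    rng = np.random.default_rng() if rng is None else rng

    # Sampling noise: draw a minibatch (Poisson sampling is common in DP-SGD;
    # uniform sampling without replacement is used here for simplicity).
    idx = rng.choice(len(X), size=batch_size, replace=False)

    # Per-example gradients of 0.5 * (x.w - y)^2 for the linear model.
    residuals = X[idx] @ w - y[idx]                  # shape: (batch_size,)
    per_example_grads = residuals[:, None] * X[idx]  # shape: (batch_size, dim)

    # Clip each per-example gradient to L2 norm <= clip_norm.
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    clipped = per_example_grads / np.maximum(1.0, norms / clip_norm)

    # Additive noise: Gaussian noise calibrated to the clipping norm; this is
    # the part that is accounted for in the privacy budget.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=w.shape)

    grad = (clipped.sum(axis=0) + noise) / batch_size
    return w - lr * grad
```

Under this framing, increasing the batch size reduces the inherent sampling noise, while the `noise_multiplier` controls the additive noise that the privacy accounting tracks; the paradigm described above shifts the overall noise budget from the former towards the latter.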