Sharing the gradients of deep neural networks instead of the training data itself can help preserve data privacy in collaborative learning. In practice, however, gradients can disclose both private latent attributes and the original data. Mathematical metrics are therefore needed to quantify both the original and the latent information leakage from gradients computed over the training data. In this work, we first use an adaptation of the empirical $\mathcal{V}$-information to provide an information-theoretic justification for attack success rates in a layer-wise manner. We then move towards a deeper understanding of gradient leakage and propose more general and efficient metrics, using sensitivity and subspace distance to quantify gradient changes w.r.t. the original and the latent information, respectively. Our empirical results, on six datasets and four models, reveal that the gradients of the first layers contain the highest amount of original information, while the classifier/fully-connected layers placed after the feature extractor contain the highest amount of latent information. Further, we show how training hyperparameters such as gradient aggregation can decrease information leakage. Our characterization provides a new understanding of gradient-based information leakage through the sensitivity of gradients to changes in private information, and points to possible defenses such as layer-based protection or strong aggregation.
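As context for the first contribution, recall the $\mathcal{V}$-information framework (Xu et al., 2020), of which the abstract mentions an empirical adaptation; the definitions below are the standard ones, not the paper's exact estimator. For a predictive family $\mathcal{V}$ and side information $X$,

$$
I_{\mathcal{V}}(X \to Y) = H_{\mathcal{V}}(Y) - H_{\mathcal{V}}(Y \mid X), \qquad H_{\mathcal{V}}(Y \mid X) = \inf_{f \in \mathcal{V}} \mathbb{E}_{x, y}\big[ -\log f[x](y) \big],
$$

where $f[x]$ is the distribution over $Y$ that $f$ predicts after observing $x$, and $H_{\mathcal{V}}(Y)$ is the same infimum with the side information withheld. In the layer-wise analysis described above, $X$ would be a given layer's gradients and $Y$ the private (original or latent) information, so a larger empirical $I_{\mathcal{V}}$ indicates more usable information for an attacker.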
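To make the two proposed metrics concrete, here is a minimal, hypothetical PyTorch sketch of how one might estimate them. The function names (`layerwise_gradients`, `sensitivity`, `subspace_distance`) and the finite-difference and principal-angle formulations are illustrative assumptions, not the paper's implementation.

```python
# Hypothetical sketch of the two leakage metrics named in the abstract:
# (i) per-layer gradient sensitivity w.r.t. the original input, and
# (ii) a subspace distance between gradient subspaces of two groups
# that differ in a latent attribute. Illustrative only.
import torch
import torch.nn as nn

def layerwise_gradients(model, loss_fn, x, y):
    """Map each parameter name to its flattened loss gradient on batch (x, y)."""
    model.zero_grad()
    loss_fn(model(x), y).backward()
    return {name: p.grad.detach().flatten().clone()
            for name, p in model.named_parameters() if p.grad is not None}

def sensitivity(model, loss_fn, x, y, eps=1e-3):
    """Per-layer gradient change under a small input perturbation:
    a finite-difference proxy for sensitivity w.r.t. the original data."""
    g = layerwise_gradients(model, loss_fn, x, y)
    g_pert = layerwise_gradients(model, loss_fn,
                                 x + eps * torch.randn_like(x), y)
    return {name: ((g[name] - g_pert[name]).norm() / eps).item() for name in g}

def subspace_distance(G_a, G_b, k=5):
    """Projection-metric distance between the top-k right-singular subspaces
    of two per-sample gradient matrices (rows = samples), e.g. two groups
    differing only in a latent attribute."""
    V_a = torch.linalg.svd(G_a, full_matrices=False).Vh[:k]  # (k, d), orthonormal rows
    V_b = torch.linalg.svd(G_b, full_matrices=False).Vh[:k]
    cosines = torch.linalg.svdvals(V_a @ V_b.T).clamp(max=1.0)  # cos of principal angles
    return torch.sqrt((1.0 - cosines**2).sum()).item()

# Toy usage on a small MLP (shapes and data are placeholders).
model = nn.Sequential(nn.Flatten(), nn.Linear(784, 64), nn.ReLU(), nn.Linear(64, 10))
x, y = torch.randn(32, 1, 28, 28), torch.randint(0, 10, (32,))
print(sensitivity(model, nn.CrossEntropyLoss(), x, y))
print(subspace_distance(torch.randn(20, 64), torch.randn(20, 64)))
```

The projection metric used here is derived from the principal angles between the two subspaces; under metrics of this kind, aggregating gradients over larger batches would be expected to shrink per-sample sensitivity, consistent with the aggregation defense mentioned in the abstract.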