Understanding when and how much a model gradient leaks information about the training samples is an important question in privacy. In this paper, we present a surprising result: even without training or memorizing the data, we can fully reconstruct the training samples from a single gradient query at a randomly chosen parameter value. We prove the identifiability of the training data under mild conditions: the result holds for both shallow and deep neural networks and for a wide range of activation functions. We also present a statistically and computationally efficient algorithm based on tensor decomposition to reconstruct the training data. As a provable attack that reveals sensitive training data, our findings suggest potentially severe threats to privacy, especially in federated learning.
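To make the tensor-decomposition step concrete, below is a minimal sketch in Python. It assumes the gradient query has already been distilled into a third-order tensor T = Σᵢ xᵢ ⊗ xᵢ ⊗ xᵢ whose rank-one components are the training samples, and it recovers the xᵢ with Jennrich's simultaneous-diagonalization algorithm. All names (d, r, X) are illustrative assumptions; this shows the principle of such a decomposition-based recovery, not the paper's full reconstruction procedure.

```python
import numpy as np

# Illustrative only: we synthesize the moment tensor directly from the
# "unknown" samples; in an actual attack it would be assembled from
# gradient information.
rng = np.random.default_rng(0)
d, r = 8, 3                        # ambient dimension, number of samples
X = rng.standard_normal((d, r))    # columns are the training samples x_i

# T = sum_i x_i (outer) x_i (outer) x_i, a rank-r third-order tensor.
T = np.einsum("ia,ja,ka->ijk", X, X, X)

# Jennrich's algorithm: contract T along the third mode with two random
# vectors, then simultaneously diagonalize the resulting matrices.
u = rng.standard_normal(d)
v = rng.standard_normal(d)
M1 = np.einsum("ijk,k->ij", T, u)  # = X diag(<u, x_i>) X^T
M2 = np.einsum("ijk,k->ij", T, v)  # = X diag(<v, x_i>) X^T

# Generically, the eigenvectors of M1 @ pinv(M2) with nonzero eigenvalues
# are exactly the directions of the x_i.
eigvals, eigvecs = np.linalg.eig(M1 @ np.linalg.pinv(M2))
top = np.argsort(-np.abs(eigvals))[:r]
X_hat = np.real(eigvecs[:, top])   # recovered samples, up to sign/scale

# Sanity check: each recovered direction aligns with some true sample.
cos = np.abs(X_hat.T @ X) / np.outer(np.linalg.norm(X_hat, axis=0),
                                     np.linalg.norm(X, axis=0))
print(cos.max(axis=1))             # should all be close to 1.0
```

Up to permutation, sign, and scale, Jennrich's algorithm recovers the components exactly whenever they are linearly independent and the random contractions yield distinct eigenvalue ratios, which is the sense in which a decomposition-based reconstruction can be provable rather than heuristic.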