Recent work shows that sensitive user data can be reconstructed from gradient updates, breaking the key privacy promise of federated learning. While success was demonstrated primarily on image data, these methods do not directly transfer to other domains such as text. In this work, we propose LAMP, a novel attack tailored to textual data, that successfully reconstructs original text from gradients. Our key insight is to model the prior probability of the text with an auxiliary language model, utilizing it to guide the search towards more natural text. Concretely, LAMP introduces a discrete text transformation procedure that minimizes both the reconstruction loss and the prior text probability, as provided by the auxiliary language model. The procedure is alternated with a continuous optimization of the reconstruction loss, which also regularizes the length of the reconstructed embeddings. Our experiments demonstrate that LAMP reconstructs the original text significantly more precisely than prior work: we recover $5\times$ more bigrams and $23\%$ longer subsequences on average. Moreover, we are the first to recover inputs from batch sizes larger than 1 for textual models. These findings indicate that gradient updates of models operating on textual data leak more information than previously thought.
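To build intuition for why gradients leak inputs at all, consider the classic observation (not LAMP itself, but the building block such attacks exploit) that the input to a fully-connected layer can be recovered exactly from its gradients: for a layer $Wx + b$, the gradient with respect to $W$ is the outer product of the output error and $x$, so any row of it divided by the corresponding bias-gradient entry yields $x$. A minimal NumPy sketch under these assumptions (single linear layer, MSE loss; all variable names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
d, m = 8, 4
x = rng.normal(size=d)      # private input (e.g. a token embedding)
y = rng.normal(size=m)      # private target
W = rng.normal(size=(m, d)) # shared model weights
b = np.zeros(m)

# Client side: compute gradients of L = 0.5 * ||Wx + b - y||^2 and share them.
delta = (W @ x + b) - y     # dL/d(output)
grad_W = np.outer(delta, x) # dL/dW = delta x^T
grad_b = delta              # dL/db = delta

# Attacker side: for any unit i with grad_b[i] != 0,
# row i of grad_W equals grad_b[i] * x, so x is recovered exactly.
i = int(np.argmax(np.abs(grad_b)))
x_recovered = grad_W[i] / grad_b[i]

assert np.allclose(x_recovered, x)  # exact reconstruction
```

For deeper models, batches, and discrete tokens this closed form no longer applies, which is why LAMP instead searches over candidate texts, scoring them by gradient-matching loss and naturalness under an auxiliary language model.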