Recent work shows that sensitive user data can be reconstructed from gradient updates, breaking the key privacy promise of federated learning. While success has been demonstrated primarily on image data, these methods do not directly transfer to other domains, such as text. In this work, we propose LAMP, a novel attack tailored to textual data, which successfully reconstructs original text from gradients. Our attack is based on two key insights: (i) modeling the prior probability of text with an auxiliary language model, guiding the search towards more natural reconstructions, and (ii) alternating continuous and discrete optimization, which minimizes the reconstruction loss on embeddings while avoiding local minima via discrete text transformations. Our experiments demonstrate that LAMP is significantly more effective than prior work: it reconstructs 5× more bigrams and 23% longer subsequences on average. Moreover, we are the first to recover inputs from batch sizes larger than 1 for textual models. These findings indicate that gradient updates of models operating on textual data leak more information than previously thought.
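To make the two insights concrete, the sketch below illustrates how an attack of this shape could be structured in PyTorch. It is a minimal, hypothetical illustration, not the authors' implementation: it assumes a HuggingFace-style target model whose forward pass accepts `inputs_embeds` and `labels`, an auxiliary causal language model (e.g. GPT-2) that shares the target's tokenizer, and known labels (real attacks must recover or assume them). The function `attack`, its hyperparameters, and the single "swap two tokens" edit are illustrative simplifications.

```python
import torch

def grad_match_loss(model, embeds, labels, observed_grads):
    """L2 distance between the gradients induced by candidate embeddings
    and the gradients observed from the federated update."""
    loss = model(inputs_embeds=embeds, labels=labels).loss
    # create_graph=True allows backpropagating through this gradient
    # computation into the candidate embeddings (double backprop).
    grads = torch.autograd.grad(loss, list(model.parameters()), create_graph=True)
    return sum(((g - o) ** 2).sum() for g, o in zip(grads, observed_grads))

def combined_score(model, aux_lm, ids, emb_matrix, labels, observed_grads, lam):
    """Reconstruction loss plus a language-model prior: the auxiliary LM's
    token-level cross-entropy (a perplexity proxy) on the candidate text."""
    rec = grad_match_loss(model, emb_matrix[ids], labels, observed_grads).item()
    prior = aux_lm(ids, labels=ids).loss.item()
    return rec + lam * prior

def attack(model, aux_lm, observed_grads, labels, emb_matrix, seq_len,
           outer=10, inner=200, lam=0.2, lr=0.01, n_swaps=20):
    # Start from random continuous embeddings of the right shape.
    embeds = torch.randn(1, seq_len, emb_matrix.size(1), requires_grad=True)
    ids = None
    for _ in range(outer):
        # Continuous phase: minimize gradient-matching loss over embeddings.
        opt = torch.optim.Adam([embeds], lr=lr)
        for _ in range(inner):
            opt.zero_grad()
            grad_match_loss(model, embeds, labels, observed_grads).backward()
            opt.step()
        # Project each embedding onto its nearest vocabulary token.
        dists = torch.cdist(embeds.detach().squeeze(0), emb_matrix)
        ids = dists.argmin(-1).unsqueeze(0)
        best = combined_score(model, aux_lm, ids, emb_matrix, labels,
                              observed_grads, lam)
        # Discrete phase: propose token swaps and keep those that lower the
        # combined reconstruction + LM-prior score, escaping local minima
        # the continuous optimizer cannot leave on its own.
        for _ in range(n_swaps):
            cand = ids.clone()
            i, j = torch.randint(seq_len, (2,))
            cand[0, i], cand[0, j] = cand[0, j].item(), cand[0, i].item()
            score = combined_score(model, aux_lm, cand, emb_matrix, labels,
                                   observed_grads, lam)
            if score < best:
                ids, best = cand, score
        # Re-seed the continuous phase from the projected discrete result.
        embeds = emb_matrix[ids].clone().requires_grad_(True)
    return ids
```

The alternation is the point of the sketch: gradient descent on embeddings is efficient but drifts to points between vocabulary entries and gets stuck in local minima, while the discrete phase re-anchors candidates to real tokens and lets the language-model prior reject reconstructions that match the gradients yet read as unnatural text.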