Neural language models are known to have a high capacity for memorizing training samples. This can have serious privacy implications when models are trained on user content such as email correspondence. Differential privacy (DP), a popular approach to training models with privacy guarantees, comes at a significant cost in utility degradation and disparate impact on subgroups of users. In this work, we introduce two privacy-preserving regularization methods for training language models that enable the joint optimization of utility and privacy through (1) the use of a discriminator and (2) the inclusion of a triplet-loss term. We compare our methods with DP through extensive evaluation. We show the advantages of our regularizers: a favorable utility-privacy trade-off, faster training with the ability to tap into existing optimization approaches, and uniform treatment of under-represented subgroups.
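To make the second regularizer concrete, the following is a minimal sketch of how a triplet-loss term can be combined with a language-modeling objective. It assumes PyTorch; the function names, the assignment of positive/negative samples to same-user/other-user representations, and the trade-off coefficient `lam` are hypothetical illustrations, not the paper's actual implementation.

```python
import torch
import torch.nn.functional as F

def triplet_privacy_regularizer(anchor, positive, negative, margin=1.0):
    """Standard triplet loss over hidden representations.

    Hypothetical setup: `anchor`, `positive`, and `negative` are model
    representations of samples from an anchor user, the same user, and a
    different user, respectively. Depending on the privacy objective, the
    term can pull same-user representations together and push other-user
    ones apart, or the reverse, to limit user-identifying memorization.
    """
    d_pos = F.pairwise_distance(anchor, positive)  # distance to same-user sample
    d_neg = F.pairwise_distance(anchor, negative)  # distance to other-user sample
    return F.relu(d_pos - d_neg + margin).mean()   # hinge on the margin

def joint_loss(lm_loss, anchor, positive, negative, lam=0.1):
    """Joint objective: LM loss plus the weighted privacy regularizer.

    `lam` (hypothetical) controls the utility-privacy trade-off.
    """
    return lm_loss + lam * triplet_privacy_regularizer(anchor, positive, negative)
```

Because the regularizer is just an additive loss term, training proceeds with standard optimizers, which is consistent with the abstract's claim that the method can tap into existing optimization approaches.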