Recent advances in deep learning have drastically improved performance on many Natural Language Understanding (NLU) tasks. However, the data used to train NLU models may contain private information, such as addresses or phone numbers, particularly when drawn from human subjects. It is desirable that the underlying models do not expose private information contained in the training data. Differentially Private Stochastic Gradient Descent (DP-SGD) has been proposed as a mechanism for building privacy-preserving models. However, DP-SGD can be prohibitively slow to train. In this work, we propose a more efficient DP-SGD for training on GPU infrastructure and apply it to fine-tuning models based on LSTM and transformer architectures. We report faster training times alongside accuracy, theoretical privacy guarantees, and the success of membership inference attacks for our models, and observe that fine-tuning with the proposed variant of DP-SGD can yield competitive models without significant degradation in training time while improving privacy protection. We also observe that looser theoretical $(\epsilon, \delta)$ bounds can translate into significant practical privacy gains.
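For context, the sketch below shows the standard DP-SGD update (per-example gradient clipping followed by Gaussian noise, as in Abadi et al., 2016) on a toy linear model; the function name, hyperparameter values, and toy data are illustrative assumptions, not the paper's optimized GPU implementation. The explicit per-example gradient loop is the usual source of DP-SGD's slowdown that an efficient GPU variant must address.

```python
import numpy as np

def dp_sgd_step(w, X, y, lr=0.1, clip_norm=1.0, noise_multiplier=1.1, rng=None):
    """One DP-SGD step on squared loss for a linear model (illustrative sketch).

    Each per-example gradient is clipped to L2 norm `clip_norm`, the clipped
    gradients are summed, Gaussian noise with standard deviation
    `noise_multiplier * clip_norm` is added, and the result is averaged
    over the batch before the update.
    """
    rng = np.random.default_rng() if rng is None else rng
    batch_size = X.shape[0]
    grad_sum = np.zeros_like(w)
    for xi, yi in zip(X, y):
        g = 2.0 * (xi @ w - yi) * xi            # per-example gradient of (x.w - y)^2
        norm = np.linalg.norm(g)
        g = g / max(1.0, norm / clip_norm)      # clip so that ||g||_2 <= clip_norm
        grad_sum += g
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=w.shape)
    return w - lr * (grad_sum + noise) / batch_size

# Toy usage: a few noisy steps on random regression data.
rng = np.random.default_rng(0)
X, y = rng.normal(size=(32, 5)), rng.normal(size=32)
w = np.zeros(5)
for _ in range(100):
    w = dp_sgd_step(w, X, y, rng=rng)
```

In frameworks where gradients are computed per minibatch rather than per example, this loop forfeits GPU parallelism, which is why naive DP-SGD is slow relative to non-private SGD.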