Privacy-preserving deep learning is an emerging field in machine learning that aims to mitigate the privacy risks arising from the use of deep neural networks. One such risk is training data extraction from language models that have been trained on datasets containing personal and privacy-sensitive information. In our study, we investigate the extent of named entity memorization in fine-tuned BERT models. We use single-label text classification as a representative downstream task and employ three different fine-tuning setups in our experiments, including one with Differential Privacy (DP). We create a large number of text samples from the fine-tuned BERT models utilizing a custom sequential sampling strategy with two prompting strategies. We search these samples for named entities and check whether they are also present in the fine-tuning datasets. We experiment with two benchmark datasets in the domains of emails and blogs. We show that the application of DP has a detrimental effect on the text generation capabilities of BERT. Furthermore, we show that a fine-tuned BERT does not generate more named entities specific to the fine-tuning dataset than a BERT model that is only pre-trained. This suggests that BERT is unlikely to emit personal or privacy-sensitive named entities. Overall, our results are important for understanding to what extent BERT-based services are prone to training data extraction attacks.
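To make the extraction procedure concrete, the following is a minimal sketch in Python, assuming the Hugging Face `transformers` masked language model API and spaCy for named entity recognition. The left-to-right mask-filling loop and the `leaked_entities` helper are illustrative stand-ins for the custom sequential sampling strategy and entity matching described above, not the exact implementation.

```python
# Illustrative sketch: sample text from a BERT masked LM and check whether
# generated named entities overlap with those in the fine-tuning dataset.
# Model name, sampling loop, and helper names are assumptions for this sketch.
import torch
import spacy
from transformers import BertForMaskedLM, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")  # or a fine-tuned checkpoint
model.eval()
ner = spacy.load("en_core_web_sm")

def sample_sequentially(prompt: str, num_tokens: int = 50) -> str:
    """Generate text left to right by repeatedly filling a trailing [MASK]."""
    ids = tokenizer.encode(prompt, add_special_tokens=False)
    for _ in range(num_tokens):
        # Append [MASK] before [SEP] and predict a distribution for it.
        inputs = torch.tensor([[tokenizer.cls_token_id] + ids
                               + [tokenizer.mask_token_id, tokenizer.sep_token_id]])
        with torch.no_grad():
            logits = model(inputs).logits
        # The mask is the second-to-last position; sample from its softmax.
        probs = torch.softmax(logits[0, -2], dim=-1)
        ids.append(int(torch.multinomial(probs, 1)))
    return tokenizer.decode(ids)

def leaked_entities(sample: str, fine_tuning_entities: set) -> set:
    """Return named entities in the sample that also occur in the fine-tuning data."""
    found = {ent.text.lower() for ent in ner(sample).ents}
    return found & fine_tuning_entities
```

Comparing the overlap sets produced for a fine-tuned model against those of a pre-trained-only model, as in the experiments above, indicates whether fine-tuning introduces additional dataset-specific entity leakage.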