Language Models (LMs) have been shown to leak information about training data through sentence-level membership inference and reconstruction attacks. Understanding the risk of LMs leaking Personally Identifiable Information (PII) has received less attention, which can be attributed to the false assumption that dataset curation techniques such as scrubbing are sufficient to prevent PII leakage. Scrubbing techniques reduce but do not prevent the risk of PII leakage: in practice, scrubbing is imperfect and must balance the trade-off between minimizing disclosure and preserving the utility of the dataset. On the other hand, it is unclear to what extent algorithmic defenses such as differential privacy, designed to guarantee sentence- or user-level privacy, prevent PII disclosure. In this work, we introduce rigorous game-based definitions for three types of PII leakage via black-box extraction, inference, and reconstruction attacks with only API access to an LM. We empirically evaluate the attacks against GPT-2 models fine-tuned with and without defenses on three domains: case law, health care, and e-mails. Our main contributions are (i) novel attacks that can extract up to 10$\times$ more PII sequences than existing attacks, (ii) showing that sentence-level differential privacy reduces the risk of PII disclosure but still leaks about 3% of PII sequences, and (iii) demonstrating a subtle connection between record-level membership inference and PII reconstruction.
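To make the threat model concrete, the sketch below shows what a generic black-box PII extraction loop could look like: repeatedly sample text from a language model through its generation API and tag PII-like spans (here, person names) with an off-the-shelf NER pipeline. This is an illustrative sketch only, not the paper's exact attack; the model name, sampling parameters, and NER tagger are placeholder assumptions.

```python
# Illustrative sketch of black-box PII extraction: sample from an LM and
# collect spans tagged as person names. Model names, sampling parameters,
# and the NER tagger are placeholders, not the paper's actual setup.
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model_name = "gpt2"  # stand-in for a fine-tuned checkpoint (hypothetical)
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
ner = pipeline("ner", aggregation_strategy="simple")  # default NER model


def extract_pii_candidates(num_samples: int = 10, max_new_tokens: int = 64):
    """Sample text from the LM and collect strings tagged as person names."""
    candidates = set()
    for _ in range(num_samples):
        # Prompt with the BOS token so each sample starts from scratch.
        inputs = tokenizer(tokenizer.bos_token, return_tensors="pt")
        output = model.generate(
            **inputs,
            do_sample=True,          # top-k sampling so repeated runs differ
            top_k=40,
            max_new_tokens=max_new_tokens,
            pad_token_id=tokenizer.eos_token_id,
        )
        text = tokenizer.decode(output[0], skip_special_tokens=True)
        for entity in ner(text):
            if entity["entity_group"] == "PER":  # keep person-name spans only
                candidates.add(entity["word"].strip())
    return candidates


if __name__ == "__main__":
    print(extract_pii_candidates())
```

The point of the sketch is that such an attack needs nothing beyond sampling access to the model, which is exactly the API-only setting the abstract describes.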