Studying data memorization in neural language models helps us understand the risks (e.g., to privacy or copyright) associated with models regurgitating training data, and aids in the evaluation of potential countermeasures. Many prior works -- and some recently deployed defenses -- focus on "verbatim memorization", defined as a model generation that exactly matches a substring from the training set. We argue that verbatim memorization definitions are too restrictive and fail to capture more subtle forms of memorization. Specifically, we design and implement an efficient defense based on Bloom filters that perfectly prevents all verbatim memorization. And yet, we demonstrate that this "perfect" filter does not prevent the leakage of training data. Indeed, it is easily circumvented by plausible and minimally modified "style-transfer" prompts -- and in some cases even by the unmodified original prompts -- which still extract memorized information. For example, instructing the model to output ALL-CAPITAL text bypasses memorization checks based on verbatim matching. We conclude by discussing potential alternative definitions and why defining memorization is a difficult yet crucial open question for neural language models.
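To make the verbatim-matching defense and its weakness concrete, below is a minimal sketch (not the paper's implementation) of a Bloom-filter check over training-set n-grams, together with the ALL-CAPS "style-transfer" bypass described above. The n-gram length, hash count, and helper names are illustrative assumptions.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter over byte strings (illustrative only)."""

    def __init__(self, num_bits: int = 1 << 20, num_hashes: int = 4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8)

    def _positions(self, item: bytes):
        # Derive several bit positions from salted SHA-256 digests.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(bytes([seed]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: bytes):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: bytes) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def build_filter(training_texts, n=5):
    """Insert every length-n token window of the training set into the filter."""
    bf = BloomFilter()
    for text in training_texts:
        for gram in ngrams(text.split(), n):
            bf.add(" ".join(gram).encode())
    return bf


def is_verbatim_leak(generation, bf, n=5):
    """Flag a generation iff some n-gram exactly matches a training n-gram."""
    return any(" ".join(g).encode() in bf for g in ngrams(generation.split(), n))


if __name__ == "__main__":
    training = ["the quick brown fox jumps over the lazy dog"]
    bf = build_filter(training, n=5)

    verbatim = "the quick brown fox jumps over the lazy dog"
    shouting = verbatim.upper()  # "style-transfer": an ALL-CAPS rewrite

    print(is_verbatim_leak(verbatim, bf))   # True  -> blocked by the verbatim filter
    print(is_verbatim_leak(shouting, bf))   # False -> same content slips through
```

The second check illustrates the paper's central point: because the filter tests only exact n-gram matches, any surface-level rewrite of memorized text (here, uppercasing) evades the defense while leaking the same information.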