The collection and availability of big data, combined with advances in pre-trained models (e.g., BERT and XLNet), have revolutionized the predictive performance of modern natural language processing tasks, ranging from text classification to text generation. This allows corporations to provide machine learning as a service (MLaaS) by encapsulating fine-tuned BERT-based models as APIs. However, BERT-based APIs have exhibited a series of security and privacy vulnerabilities. For example, prior work has exploited security issues in BERT-based APIs through adversarial examples crafted against an extracted model. In contrast, the privacy leakage of BERT-based APIs through an extracted model has not been well studied. Moreover, owing to the high capacity of BERT-based APIs, the fine-tuned model is prone to overlearning, yet what kind of information can leak through the extracted model remains unknown. In this work, we bridge this gap by first presenting an effective model extraction attack, in which an adversary can practically steal a BERT-based API (the target/victim model) with only a limited number of queries. We further develop an effective attribute inference attack that can infer sensitive attributes of the training data used by the BERT-based API. Our extensive experiments on benchmark datasets under various realistic settings validate the potential vulnerabilities of BERT-based APIs. Moreover, we demonstrate that two promising defense methods are ineffective against our attacks, which calls for more effective defenses.
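To make the query-based extraction step concrete, below is a minimal sketch, not the paper's actual attack pipeline: the adversary labels attacker-chosen queries with the victim API's predictions and fine-tunes a local BERT copy on those pairs. Here `query_victim` is a hypothetical placeholder for the victim BERT-based API, and the query set, label count, and hyperparameters are illustrative.

```python
# Sketch of a query-based model extraction attack on a text-classification API.
# `query_victim` is a hypothetical stand-in for the victim BERT-based API;
# the rest uses standard PyTorch / Hugging Face Transformers calls.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

def query_victim(texts):
    """Hypothetical victim API: returns one predicted label id per input text."""
    raise NotImplementedError("replace with real API calls")

# 1. Label attacker-chosen queries with the victim's predictions.
queries = ["example query 1", "example query 2"]  # in practice: a large query set
victim_labels = torch.tensor(query_victim(queries))

# 2. Fine-tune a local BERT copy (the "extracted" model) on the query/label pairs.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2  # illustrative binary task
)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

enc = tokenizer(queries, padding=True, truncation=True, return_tensors="pt")
model.train()
for epoch in range(3):
    optimizer.zero_grad()
    out = model(**enc, labels=victim_labels)  # cross-entropy vs. victim labels
    out.loss.backward()
    optimizer.step()
```

The resulting local copy is the artifact the abstract's second stage would then exploit, e.g., an attribute inference attack probing the extracted model for sensitive attributes of the victim's training data.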