Model Extraction and Adversarial Transferability, Your BERT is Vulnerable!

Natural language processing (NLP) tasks, ranging from text classification to text generation, have been revolutionised by pre-trained language models such as BERT. This allows corporations to easily build powerful APIs by encapsulating fine-tuned BERT models for downstream tasks. However, when a fine-tuned BERT model is deployed as a service, it may suffer from different attacks launched by malicious users. In this work, we first present how an adversary can steal a BERT-based API service (the victim/target model) on multiple benchmark datasets with limited prior knowledge and queries. We further show that the extracted model can lead to highly transferable adversarial attacks against the victim model. Our studies indicate that the potential vulnerabilities of BERT-based API services still hold, even when there is an architectural mismatch between the victim model and the attack model. Finally, we investigate two defence strategies to protect the victim model and find that unless the performance of the victim model is sacrificed, both model extraction and adversarial transferability can effectively compromise the target models.
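
The extraction step outlined in the abstract (query the deployed service, then train a local copy on its answers) can be illustrated with a minimal toy sketch. Everything here is an assumption made for illustration: `victim_api` is a hypothetical keyword-based stand-in for a fine-tuned BERT service, the surrogate is a simple word-count voter rather than a BERT model, and the query set is made up. It is not the procedure used in the paper.

```python
from collections import Counter


def victim_api(text: str) -> int:
    """Hypothetical black-box service: returns only a label (1 = positive, 0 = negative)."""
    positive = {"good", "great", "love", "excellent"}
    negative = {"bad", "awful", "hate", "poor"}
    words = text.lower().split()
    score = sum(w in positive for w in words) - sum(w in negative for w in words)
    return 1 if score > 0 else 0


def extract_model(queries, api):
    """Label attacker-chosen queries via the API, then fit a word-count surrogate."""
    counts = {0: Counter(), 1: Counter()}
    for text in queries:
        # The attacker sees only the predicted label, never the victim's parameters.
        counts[api(text)].update(text.lower().split())

    def surrogate(text: str) -> int:
        # Each word votes for the class in which it was observed more often.
        words = text.lower().split()
        score = sum(counts[1][w] - counts[0][w] for w in words)
        return 1 if score > 0 else 0

    return surrogate


# Made-up query set; a real attacker would draw queries from some unlabeled text source.
queries = [
    "good movie", "great acting", "love this film", "excellent plot",
    "bad movie", "awful acting", "hate this film", "poor plot",
]
surrogate = extract_model(queries, victim_api)

# Agreement between surrogate and victim on inputs not used as queries.
held_out = ["great plot", "awful film", "love the acting", "poor movie"]
agreement = sum(surrogate(t) == victim_api(t) for t in held_out) / len(held_out)
print(f"agreement with victim: {agreement:.2f}")
```

Once a surrogate agrees closely with the victim, an attacker can search for adversarial inputs against the local copy and replay them against the service, which is the transferability setting the abstract refers to.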

