Good Artists Copy, Great Artists Steal: Model Extraction Attacks Against Image Translation Generative Adversarial Networks

Machine learning models are typically made available to potential client users via inference APIs. Model extraction attacks occur when a malicious client uses information gleaned from queries to the inference API of a victim model $F_V$ to build a surrogate model $F_A$ that has comparable functionality. Recent research has shown successful model extraction attacks against image classification and NLP models. In this paper, we show the first model extraction attack against real-world generative adversarial network (GAN) image translation models. We present a framework for conducting model extraction attacks against image translation models, and show that the adversary can successfully extract functional surrogate models. The adversary is not required to know $F_V$'s architecture or any other information about it beyond its intended image translation task, and queries $F_V$'s inference interface using data drawn from the same domain as the training data for $F_V$. We evaluate the effectiveness of our attacks using three different instances of two popular categories of image translation: (1) Selfie-to-Anime and (2) Monet-to-Photo (image style transfer), and (3) Super-Resolution (super resolution). Using standard performance metrics for GANs, we show that our attacks are effective in each of the three cases. The differences between $F_V$ and $F_A$, compared to the target, are in the following ranges: Selfie-to-Anime: FID $13.36$ to $68.66$; Monet-to-Photo: FID $3.57$ to $4.40$; and Super-Resolution: SSIM $0.06$ to $0.08$ and PSNR $1.43$ to $4.46$. Furthermore, we conducted a large-scale (125 participants) user study on Selfie-to-Anime and Monet-to-Photo to show that human perception of the images produced by the victim and surrogate models can be considered equivalent, within an equivalence bound of Cohen's $d=0.3$.
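The query-then-train loop described in the abstract can be illustrated with a toy sketch. This is not the paper's method: the victim here is a hypothetical fixed linear map standing in for a GAN image translation model, the surrogate is fitted by least squares rather than by GAN training, and all names (`query_victim`, `extract_surrogate`, `psnr`) are assumptions made for illustration. The sketch only shows the black-box setting, where the attacker sees outputs for inputs drawn from the same domain, and how PSNR could then compare victim and surrogate outputs.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical black-box victim F_V: a fixed linear "translation" of
# flattened 8x8 images. The attacker never reads _W_SECRET directly.
_W_SECRET = rng.normal(scale=0.1, size=(64, 64))


def query_victim(x):
    """Stand-in for the victim's inference API: inputs in, outputs out."""
    return x @ _W_SECRET


def extract_surrogate(queries):
    """Fit a surrogate F_A on (query, victim output) pairs.

    Least squares replaces GAN training here purely to keep the toy small.
    """
    targets = query_victim(queries)
    w_hat, *_ = np.linalg.lstsq(queries, targets, rcond=None)
    return lambda x: x @ w_hat


def psnr(reference, estimate, data_range):
    """Peak signal-to-noise ratio in dB between two arrays."""
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(data_range ** 2 / mse)


# Attacker's query data, drawn from the same (toy) input domain.
queries = rng.uniform(0.0, 1.0, size=(500, 64))
surrogate = extract_surrogate(queries)

# Compare victim and surrogate on held-out inputs.
held_out = rng.uniform(0.0, 1.0, size=(100, 64))
victim_out = query_victim(held_out)
surrogate_out = surrogate(held_out)
data_range = victim_out.max() - victim_out.min()
score = psnr(victim_out, surrogate_out, data_range)
print(f"PSNR between victim and surrogate outputs: {score:.1f} dB")
```

Because the toy victim is exactly linear, the surrogate recovers it almost perfectly. A real image translation model would require training a generative surrogate, and the resulting gap is what the FID, SSIM, and PSNR differences in the abstract quantify.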

