Machine learning models are typically made available to potential client users via inference APIs. Model extraction attacks occur when a malicious client uses information gleaned from queries to the inference API of a victim model $F_V$ to build a surrogate model $F_A$ with comparable functionality. Recent research has shown successful model extraction of image classification and natural language processing models. In this paper, we show the first model extraction attack against real-world generative adversarial network (GAN) image translation models. We present a framework for conducting such attacks, and show that an adversary can successfully extract functional surrogate models by querying $F_V$ using data from the same domain as the training data for $F_V$. The adversary need not know $F_V$'s architecture or any other information about it beyond its intended task. We evaluate the effectiveness of our attacks using three different instances of two popular categories of image translation: (1) Selfie-to-Anime and (2) Monet-to-Photo (image style transfer), and (3) Super-Resolution (super resolution). Using standard performance metrics for GANs, we show that our attacks are effective. Furthermore, we conducted a large-scale (125 participants) user study on Selfie-to-Anime and Monet-to-Photo to show that human perception of the images produced by $F_V$ and $F_A$ can be considered equivalent, within an equivalence bound of Cohen's d = 0.3. Finally, we show that existing defenses against model extraction attacks (watermarking, adversarial examples, poisoning) do not extend to image translation models.
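To make the query-based extraction step concrete, the following is a minimal PyTorch sketch of the attack loop described above, under loudly stated assumptions: `query_victim` is a hypothetical wrapper around $F_V$'s inference API, the tiny encoder-decoder stands in for a full GAN generator $F_A$, and random tensors stand in for same-domain query images. It is an illustration of the black-box setting, not the paper's actual surrogate architecture or training procedure.

```python
# Hedged sketch of black-box model extraction against an image translation API.
# Assumptions (not from the paper): query_victim() is a hypothetical API wrapper;
# the surrogate below is a toy stand-in for a real GAN generator F_A.
import torch
import torch.nn as nn

def query_victim(x: torch.Tensor) -> torch.Tensor:
    """Hypothetical stand-in for the victim F_V: a real attack would send `x`
    to the inference endpoint and return the translated image it serves back."""
    return x  # placeholder response for illustration only

# Tiny encoder-decoder surrogate; the paper's F_A would be a full GAN generator.
surrogate = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 3, 3, padding=1), nn.Tanh(),
)
opt = torch.optim.Adam(surrogate.parameters(), lr=2e-4)
loss_fn = nn.L1Loss()  # simple pixel-wise proxy for the surrogate's training loss

# Attack loop: query F_V with same-domain images, then fit F_A to the
# resulting (input, output) pairs. No access to F_V's architecture,
# weights, gradients, or training data is assumed.
for step in range(100):
    x = torch.rand(8, 3, 64, 64)       # stand-in for same-domain query images
    with torch.no_grad():
        y_victim = query_victim(x)     # supervision comes only from the API
    loss = loss_fn(surrogate(x), y_victim)
    opt.zero_grad()
    loss.backward()
    opt.step()
```

The property the sketch captures is that the surrogate is supervised solely by the API's responses: the adversary only needs query access and in-domain inputs, which is what makes the attack realistic against deployed inference APIs.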