Protecting the intellectual property (IP) of deep neural networks (DNNs) has recently become a major concern for the AI industry. To combat potential model piracy, recent works explore various watermarking strategies that embed secret identity messages into either the prediction behaviors or the internals (e.g., weights and neuron activations) of the target model. Because it sacrifices less functionality and exploits more knowledge about the target model, the latter branch of watermarking schemes (i.e., white-box model watermarking) is claimed to be accurate, credible, and secure against most known watermark removal attacks, and it is attracting growing research effort and industrial applications. In this paper, we present the first effective removal attack that cracks almost all existing white-box watermarking schemes with provably no performance overhead and no required prior knowledge. By analyzing these IP protection mechanisms at the granularity of neurons, we discover for the first time their common dependence on a set of fragile features of a local neuron group, all of which can be arbitrarily tampered with by our proposed chain of invariant neuron transforms. On $9$ state-of-the-art white-box watermarking schemes and a broad set of industry-level DNN architectures, our attack for the first time reduces the embedded identity message in the protected models to almost random. Meanwhile, unlike known removal attacks, our attack requires no prior knowledge of the training data distribution or the adopted watermarking algorithms, and leaves model functionality intact.
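To make the idea of an invariant neuron transform concrete, the following is a minimal sketch and not the implementation from this paper. It assumes a toy two-layer ReLU network and applies two function-preserving transforms that are well known for such networks: permuting the hidden neurons, and scaling each hidden neuron by a positive factor while dividing its outgoing weights by the same factor. All names (`forward`, `shuffle_and_scale`, the layer sizes) are hypothetical and chosen only for illustration.

```python
import random


def relu(v):
    return [max(0.0, x) for x in v]


def matvec(W, x):
    # W is a list of rows; returns W @ x.
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def forward(W1, b1, W2, b2, x):
    # Toy network: y = W2 @ relu(W1 @ x + b1) + b2
    h = relu([a + b for a, b in zip(matvec(W1, x), b1)])
    return [a + b for a, b in zip(matvec(W2, h), b2)]


def shuffle_and_scale(W1, b1, W2, perm, scales):
    # New hidden neuron j is old neuron perm[j], scaled by scales[j] > 0.
    # Since relu(s * z) == s * relu(z) for s > 0, dividing the outgoing
    # weights by scales[j] cancels the scaling in the next layer.
    n = len(perm)
    W1_new = [[scales[j] * w for w in W1[perm[j]]] for j in range(n)]
    b1_new = [scales[j] * b1[perm[j]] for j in range(n)]
    W2_new = [[row[perm[j]] / scales[j] for j in range(n)] for row in W2]
    return W1_new, b1_new, W2_new


rng = random.Random(0)
n_in, n_hid, n_out = 4, 6, 3  # hypothetical layer sizes
W1 = [[rng.gauss(0, 1) for _ in range(n_in)] for _ in range(n_hid)]
b1 = [rng.gauss(0, 1) for _ in range(n_hid)]
W2 = [[rng.gauss(0, 1) for _ in range(n_hid)] for _ in range(n_out)]
b2 = [rng.gauss(0, 1) for _ in range(n_out)]

perm = list(range(n_hid))
rng.shuffle(perm)
scales = [rng.uniform(0.5, 2.0) for _ in range(n_hid)]
W1_new, b1_new, W2_new = shuffle_and_scale(W1, b1, W2, perm, scales)

x = [rng.gauss(0, 1) for _ in range(n_in)]
y_before = forward(W1, b1, W2, b2, x)
y_after = forward(W1_new, b1_new, W2_new, b2, x)

# The outputs agree up to floating-point rounding, yet the stored weights differ.
max_diff = max(abs(a - b) for a, b in zip(y_before, y_after))
weights_changed = W1_new != W1
print(max_diff, weights_changed)
```

In this sketch the network computes the same function before and after the transform, while the individual weight values and the ordering of the neurons change. This is the kind of discrepancy a watermark tied to the weights or activations of specific neurons would be sensitive to.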