Training highly performant deep neural networks (DNNs) typically requires collecting a massive dataset and using powerful computing resources. Therefore, unauthorized redistribution of private pre-trained DNNs may cause severe economic loss for model owners. To protect the ownership of DNN models, DNN watermarking schemes have been proposed that embed secret information in a DNN model and verify its presence to establish ownership. However, existing DNN watermarking schemes compromise model utility and are vulnerable to watermark removal attacks because embedding the watermark modifies the model itself. Alternatively, a new approach dubbed DEEPJUDGE was introduced to measure the similarity between a suspect model and a victim model without modifying the victim model. However, DEEPJUDGE is designed only to detect cases where the suspect model's architecture is the same as the victim model's. In this work, we propose a novel DNN fingerprinting technique dubbed DEEPTASTER to detect a new attack scenario in which a victim's data is stolen to build a suspect model. DEEPTASTER can effectively detect such data theft attacks even when the suspect model's architecture differs from the victim model's. To achieve this goal, DEEPTASTER generates a few adversarial images with perturbations, transforms them into the Fourier frequency domain, and uses the transformed images to identify the dataset used to train a suspect model. The intuition is that such adversarial images capture the characteristics of DNNs built on a specific dataset. We evaluated the detection accuracy of DEEPTASTER on three datasets with three model architectures under various attack scenarios, including transfer learning, pruning, fine-tuning, and data augmentation. Overall, DEEPTASTER achieves a balanced accuracy of 94.95%, significantly better than the 61.11% achieved by DEEPJUDGE in the same settings.
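The abstract's pipeline step of transforming perturbed images into the Fourier frequency domain can be illustrated with a minimal sketch. The exact adversarial attack and preprocessing are not specified here, so this hypothetical example uses a random perturbation as a stand-in for a model-derived adversarial perturbation; `to_frequency_fingerprint` is an illustrative name, not from the paper.

```python
import numpy as np

def to_frequency_fingerprint(image, perturbation, eps=0.03):
    """Perturb an image and map it to the Fourier frequency domain.

    `perturbation` stands in for an adversarial perturbation (e.g. the
    sign of a loss gradient); the real scheme derives it from a model.
    """
    adv = np.clip(image + eps * np.sign(perturbation), 0.0, 1.0)
    # 2-D DFT per channel, shifted so low frequencies sit at the center.
    spectrum = np.fft.fftshift(np.fft.fft2(adv, axes=(0, 1)), axes=(0, 1))
    # Log-magnitude is a common, scale-compressed frequency representation.
    return np.log1p(np.abs(spectrum))

rng = np.random.default_rng(0)
img = rng.random((32, 32, 3))          # stand-in for a dataset image
pert = rng.standard_normal(img.shape)  # stand-in for a gradient
fp = to_frequency_fingerprint(img, pert)
print(fp.shape)  # (32, 32, 3)
```

A classifier trained on such frequency-domain representations could then serve as the dataset-identification step the abstract describes.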