On Function-Coupled Watermarks for Deep Neural Networks

Training well-performing deep neural networks (DNNs) generally requires massive labelled datasets and substantial computational resources. Various watermarking techniques have been proposed to protect this intellectual property (IP): the DNN provider implants secret information into the model and later claims ownership by retrieving the embedded watermark with dedicated trigger inputs. While promising results are reported in the literature, existing solutions are vulnerable to watermark removal attacks such as model fine-tuning and model pruning. In this paper, we propose a novel DNN watermarking solution that can effectively defend against these attacks. Our key insight is to strengthen the coupling between the watermark and the model's functionality, so that removing the watermark would inevitably degrade the model's performance on normal inputs. To this end, unlike previous methods that rely on secret features learnt from out-of-distribution data, our method uses only features learnt from in-distribution data. Specifically, we first sample inputs from the original training dataset and fuse them to form watermark triggers. Second, we randomly mask model weights during training so that the information of the embedded watermark spreads across the network. As a result, model fine-tuning and pruning do not make the model forget our function-coupled watermarks. Evaluation results on various image classification tasks show a 100% watermark authentication success rate under aggressive watermark removal attacks, significantly outperforming existing solutions. Code is available at https://github.com/cure-lab/Function-Coupled-Watermark.
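The two ingredients named in the abstract, fusing in-distribution samples into a trigger and randomly masking weights during training, can be illustrated with a minimal sketch. It is not the authors' implementation, which is in the linked repository. The pixel-wise blend, the per-weight zeroing, and all names (`fuse_images`, `make_trigger`, `random_weight_mask`, `alpha`, `drop_prob`) are assumptions made here for illustration, since the abstract does not specify the fusion or masking scheme.

```python
import random


def fuse_images(img_a, img_b, alpha=0.5):
    """Blend two equally sized grayscale images (lists of rows), pixel by pixel.

    Assumption: a convex combination stands in for the fusion step;
    the abstract does not say how the sampled inputs are fused.
    """
    same_rows = len(img_a) == len(img_b)
    if not same_rows or any(len(ra) != len(rb) for ra, rb in zip(img_a, img_b)):
        raise ValueError("images must have the same shape")
    return [
        [alpha * a + (1.0 - alpha) * b for a, b in zip(ra, rb)]
        for ra, rb in zip(img_a, img_b)
    ]


def make_trigger(dataset, rng, alpha=0.5):
    """Sample two distinct in-distribution images and fuse them into a trigger."""
    img_a, img_b = rng.sample(dataset, 2)
    return fuse_images(img_a, img_b, alpha)


def random_weight_mask(weights, drop_prob, rng):
    """Return a copy of a weight matrix with each entry zeroed with
    probability drop_prob (hypothetical masking rule for illustration)."""
    return [
        [0.0 if rng.random() < drop_prob else w for w in row]
        for row in weights
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy "training set" of three 2x2 grayscale images.
    dataset = [
        [[0.0, 0.0], [0.0, 0.0]],
        [[1.0, 1.0], [1.0, 1.0]],
        [[0.2, 0.4], [0.6, 0.8]],
    ]
    trigger = make_trigger(dataset, rng)
    masked = random_weight_mask([[0.3, -0.1], [0.7, 0.5]], drop_prob=0.5, rng=rng)
    print(trigger)
    print(masked)
```

In a real training loop the mask would be redrawn at each watermark-embedding step, so that no small subset of weights carries the watermark alone. That is the intuition the abstract gives for robustness to pruning and fine-tuning.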
