With the widespread use of deep neural networks (DNNs) in many areas, a growing number of studies focus on protecting DNN models from intellectual property (IP) infringement. Many existing methods apply digital watermarking to protect DNN models: the majority either embed a watermark directly into the internal network structure/parameters, or insert a zero-bit watermark by fine-tuning the model to be protected on a set of so-called trigger samples. Although these methods work well, they were designed for individual DNN models and cannot be directly applied to deep ensemble models (DEMs), which combine multiple DNN models to make the final decision. This motivates us to propose a novel black-box watermarking method for DEMs that can be used to verify the integrity of a DEM. In the proposed method, a certain number of sensitive samples are carefully selected by mimicking real-world attacks against DEMs and analyzing the prediction results of the sub-models of the non-attacked DEM and the attacked DEM on a carefully crafted dataset. By analyzing the prediction results of the target DEM on these sensitive samples, we are able to verify the integrity of the target DEM. Unlike many previous methods, the proposed method does not modify the original DEM to be protected, which means the proposed method is lossless. Experimental results show that DEM integrity can be reliably verified even if only one sub-model was attacked, which indicates good potential in practice.
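To make the abstract's pipeline concrete, below is a minimal sketch of the sensitive-sample idea, assuming PyTorch sub-models that map a batch of inputs to class logits. Everything here is an illustrative assumption rather than the paper's exact algorithm: the attack is simulated by small parameter perturbation (standing in for fine-tuning or pruning attacks), candidates are scored by how much each sub-model's prediction changes under that simulated attack, and verification simply checks that the deployed DEM reproduces the recorded reference predictions.

```python
# Hedged sketch of sensitive-sample selection and integrity verification.
# Assumptions: sub_models are torch.nn.Module classifiers; `candidates` is a
# tensor batch of candidate inputs; the attack proxy and the L1 selection
# score are illustrative choices, not the paper's exact procedure.
import copy
import torch

def simulate_attack(model, noise_std=1e-3):
    """Assumed attack proxy: perturb parameters with small Gaussian noise."""
    attacked = copy.deepcopy(model)
    with torch.no_grad():
        for p in attacked.parameters():
            p.add_(noise_std * torch.randn_like(p))
    return attacked

def select_sensitive_samples(sub_models, candidates, k=32):
    """Keep the k candidates whose sub-model predictions change most
    between the non-attacked and the (simulated) attacked sub-models."""
    scores = torch.zeros(len(candidates))
    for model in sub_models:
        attacked = simulate_attack(model)
        with torch.no_grad():
            p_clean = model(candidates).softmax(dim=1)
            p_attacked = attacked(candidates).softmax(dim=1)
        # Accumulate the per-sample L1 gap between clean and attacked outputs.
        scores += (p_clean - p_attacked).abs().sum(dim=1)
    return candidates[scores.topk(k).indices]

def verify_integrity(dem_predict, sensitive_samples, reference_preds):
    """Black-box check: the deployed DEM must reproduce the predictions
    recorded on the sensitive samples at protection time."""
    return torch.equal(dem_predict(sensitive_samples), reference_preds)
```

Because selection happens before deployment and verification only queries the target DEM's predictions, the protected DEM itself is never modified, which is what makes the scheme lossless.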