Machine learning (ML) models are costly to train because they can require significant amounts of data, computational resources, and technical expertise. They therefore constitute valuable intellectual property that needs protection from adversaries who want to steal them. Ownership verification techniques allow the victims of model stealing attacks to demonstrate that a suspect model was in fact stolen from theirs. Although a number of ownership verification techniques based on watermarking or fingerprinting have been proposed, most of them fall short in terms of either security guarantees (well-equipped adversaries can evade verification) or computational cost. A fingerprinting technique introduced at ICLR '21, Dataset Inference (DI), has been shown to offer better robustness and efficiency than prior methods. The authors of DI provided a correctness proof for linear (suspect) models. However, in the same setting, we first prove that DI suffers from a high rate of false positives (FPs): it can incorrectly identify as stolen an independent model trained with non-overlapping data from the same distribution. We further prove that DI also triggers FPs in realistic, non-linear suspect models, and we confirm empirically that DI leads to FPs with high confidence. Second, we show that DI also suffers from false negatives (FNs): an adversary can fool DI by regularising a stolen model's decision boundaries using adversarial training. To this end, we demonstrate that DI fails to identify a model adversarially trained from a stolen dataset, the setting where DI is the hardest to evade. Finally, we discuss the implications of our findings and the viability of fingerprinting-based ownership verification in general, and we suggest directions for future work.