Watermarking has become a common approach to protecting the intellectual property of DNN models. Recent works, taking the adversary's perspective, have attempted to subvert watermarking mechanisms by designing watermark removal attacks. However, these attacks mainly rely on sophisticated fine-tuning techniques, which suffer from fatal drawbacks or unrealistic assumptions. In this paper, we propose a novel watermark removal attack from a different perspective. Instead of only fine-tuning the watermarked models, we design a simple yet powerful transformation algorithm that combines imperceptible pattern embedding with spatial-level transformations, which can effectively and blindly destroy the watermarked models' memorization of the watermark samples. We also introduce a lightweight fine-tuning strategy to preserve model performance. Our solution requires far fewer resources and much less knowledge about the watermarking scheme than prior works. Extensive experimental results indicate that our attack can bypass state-of-the-art watermarking solutions with very high success rates. Based on our attack, we propose watermark augmentation techniques to enhance the robustness of existing watermarks.
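To make the idea of combining imperceptible pattern embedding with a spatial-level transformation more concrete, the following is a minimal sketch of such an input transformation. It is not the algorithm from the paper: the faint grid pattern, the small random shift, and every name and parameter here (`embed_pattern`, `spatial_transform`, `transform_input`, `amplitude`, `period`, `max_shift`) are assumptions chosen only for illustration.

```python
import numpy as np


def embed_pattern(image, amplitude=2.0 / 255.0, period=8):
    """Overlay a faint grid on an HxWxC image with values in [0, 1].

    The grid pattern and its amplitude are hypothetical choices; the
    abstract only says the embedded pattern is imperceptible.
    """
    out = image.astype(np.float64, copy=True)
    out[::period, :, :] += amplitude   # horizontal grid lines
    out[:, ::period, :] += amplitude   # vertical grid lines
    return np.clip(out, 0.0, 1.0)


def spatial_transform(image, rng, max_shift=2):
    """Shift the image by a few random pixels, padding with edge values.

    A random shift stands in for the spatial-level transformation; the
    abstract does not specify which transformations are used.
    """
    h, w = image.shape[:2]
    dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
    padded = np.pad(
        image,
        ((max_shift, max_shift), (max_shift, max_shift), (0, 0)),
        mode="edge",
    )
    top, left = max_shift + dy, max_shift + dx
    return padded[top:top + h, left:left + w, :]


def transform_input(image, rng):
    """Apply pattern embedding first, then the spatial transformation."""
    return spatial_transform(embed_pattern(image), rng)


rng = np.random.default_rng(0)
image = rng.random((32, 32, 3))
transformed = transform_input(image, rng)
```

In a pipeline of this shape, the transformation would be applied to inputs before they reach the watermarked model, with the lightweight fine-tuning step mentioned in the abstract used to recover accuracy on normal samples. That step is not sketched here.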